Voltaire ufm en_nov10
-
Upload
sciecomp -
Category
Technology
-
view
762 -
download
1
Transcript of Voltaire ufm en_nov10
© 2010 Voltaire Inc.
November 19, 2010
Unified Fabric Manager Overview
Ghislain de Jacquelot
© 2010 Voltaire Inc. 2
Voltaire Software Portfolio
Robust RDMA Drivers
Fabric provisioning and
performance monitoring
Robust Drivers
MPI Acceleration
Multicast Acceleration
Storage Access Acceleration
Collective communication
offload
Multicast and TCP transport utilizing
Kernel bypass technology
RDMA based storage iSCSI target
Multicast and TCP
transport utilizing
Kernel bypass
technology
© 2010 Voltaire Inc. 3
Unified Fabric Manager
“So far, I haven't seen any other solutions claiming to be a "fabric manager" offer the sophisticated insight, resource management, performance trending, and core fabric function extension that UFM can … it fully illustrates what a well architected fabric should be capable of.”
Jeff Boles, Taneja Group, June 2009
© 2010 Voltaire Inc. 4
Infiniband Traditional Management
© 2010 Voltaire Inc. 5
An Infiniband Fabric is not a black box (1/2)
► Requires Hardware management
• Detect failures, communication problems
Inside the Infiniband Fabric
- Port counters
- Port status (QDR,DDR,SDR – 4X,2X,1X)
- Firmware upgrades (Switch and HCA ASICs)
Outside the Infiniband Fabric
- Chassis
- Power supplies
- Fans
- Temperature
- Chassis software updates (Switch management)
© 2010 Voltaire Inc. 6
An Infiniband Fabric is not a black box (2/2)
►What about performance ?
►Some embarrassing questions…
• Blocking vs non-blocking fabrics ?
• Influence of routing algorithms ?
• Congestion ?
• Mixing different protocols on the same fabric ?
• Running multiple jobs on the same fabric ?
• Performance monitoring Tools ?
© 2010 Voltaire Inc. 7
UFM Central Management Platform
► In-depth visibility into fabric health and traffic
• Central Dashboard, Unique Congestion Map
• Advanced monitoring engine, threshold based alerts
► Optimize application performance
• Quality of Service
• Traffic Aware Routing Algorithm
► Efficient operations of thousands of fabric components
• Automated configuration of hosts and switches, group tasks
• Seamless change management
Unified
Fabric
Manager
© 2010 Voltaire Inc. 8
Introducing UFM
UFM Server
CLIGUI
(Java)
Web
Services
IB-SM
(OpenSM)
Perf Mng
Providers
Device Mng
Providers
SQL
DB
HA
Daemon
Access
Control
Central administration
of multiple switches
(or hosts)Hierarchal performance
monitoring,
variety of sources
Leverage open
source SM
engine
Transparent
fail-over
Fast retrieval,
historical data
Manage complex
relations and
workflows
Voltaire
Plug-ins
User and
application
interfaces
© 2010 Voltaire Inc. 10
Advanced Monitoring and Analysis
► Monitor & analyze fabric performance
• Bandwidth utilization
• Unique congestion monitoring
• Dashboard for aggregated fabric view
► Real-time fabric-wide health monitoring
• Monitor events and errors through-out the fabric
• Threshold based alarms
• Granular monitoring of host and switch parameters
► Innovative congestion mapping
• One view for fabric-wide congestion and traffic patterns
• Enables root cause analysis for routing, job placement or resource allocation inefficiencies
► All is managed at the application/aggregation level
• Event effects are clearly visible
• Pro-active measures can be taken
© 2010 Voltaire Inc. 11
Central Dashboard
Resource Utilization
& StatusCongestion Map
Top 10 alerted nodesEvent Pane
Top 10’s
B/W, Congestion
B/W Consumers
© 2010 Voltaire Inc. 12
Advanced Monitoring Engine
Multiple sessions
On demand
Sessions per Logical
Groups – no need to
know physical nodes
Aggregation per
Multiple devices
Various graphs (linear,
bar, historgram, pie…)
Correlate switch and
host information
Formulas (AVG, Max,
Min, Sum)
© 2010 Voltaire Inc. 13
Performance Optimization Cycle with UFM
Characterize
traffic pattern and priorities
Unique logical fabric model
QoS to prioritize critical apps.
Optimize routing with Voltaire’s
Traffic Optimized Routing (TOR)
Show traffic and congestion
information
Unique Congestion Map
Feedback and Analysis
OptionalOrchestrators
& Schedulers
Application Requirements
UFM OptimizationUFM Monitoring
© 2010 Voltaire Inc. 14
Advanced Performance Optimization Mechanisms
► Fabric virtualization and Quality of Service (QoS)
• Run multiple clusters or multiple jobs on the same infrastructure
• Assure critical applications get priority through QoS policy
• Provide the required isolation for different departments or jobs
► Traffic Aware Routing Algorithm (TARA)
• Voltaire’s major shift from static to traffic aware routing
• Routing enhancements are built on top of OpenSM in a modular plug-in architecture
• Takes into consideration traffic patterns and loads
• Traffic model can be derived automatically from fabric model or via API with 3rd party schedulers
Applicable to both DDR and QDR Environments
© 2010 Voltaire Inc. 15
Congestion Example
► Degradation due to node oversubscription
• Destination port in saturation (multiple sources)
• Congestion spread across the fabric
• ALL other flows drop to 20% of capacity
• Take time to recover
• Common with storage traffic
drop
recovery
© 2010 Voltaire Inc. 16
Quality of Service Optimization
UFM Enables QoS Optimization
© 2010 Voltaire Inc. 17
Test Environment
► 2 nodes running a latency critical job
► 12 nodes running a bandwidth consuming job
► Goal: achieve best performance with Latency critical tasks
© 2010 Voltaire Inc. 18
W/O Partitioning Latency degradation of ~215%
Latency job running alone
(Latency = ~2.1 us)Bandwidth job added on
same partition
(Latency = ~4.5 us)
© 2010 Voltaire Inc. 19
UFM Logical Model Creates Partition and Sets QoS
► 2 Logical Groups
• Latency job
• B/W oriented job
► QoS settings
► UFM creates virtual NICs, partitions and assigns Service Levels on the fabric
© 2010 Voltaire Inc. 20
With UFM QoSCross Application Interference fixed
Single job in cluster
(Latency = 2.1us)
2 jobs, UFM optimization
(Latency = 2.2us)
2nd job added
(Latency = 4.5us)
100% Better Performance Through QoS Implementation
© 2010 Voltaire Inc. 21
Optimize performance #2: routing
► Existing routing algorithms
• Are not aware of application communication flow
• They distribute paths evenly across the fabric links
► In real life, fabrics have non uniform usage
• Some endpoints “talk” a lot, some don’t “talk” at all
• Many-to-many (cluster) and any-to-many (storage) topologies
► Result
• Unbalanced fabric
• Congestion is created leading to slower performance and high latency
Congestion = Latency
© 2010 Voltaire Inc. 22
TARA Optimization
► TARA provides the following benefits:
• Reduces competition between fabric resources, thus decreasing congestion
• Increases available bandwidth, resulting in improved fabric utilization
• Delivers lower latency and shorter application runtime
► How ?
• Uses knowledge of cluster usage: logical servers, networks.
• Balances routes depending on usage
• Not based on real-time analysis of bandwidth / congestion
© 2010 Voltaire Inc. 23
Routing ?
► InfiniBand packets are ‘destination routed’ based on the Destination Logical ID (DLID) field in the header
► In IB: DLID=route (not only remote address)
► DLIDs are 16 bits
• 48K values are used for unicast
• 16K values are used for multicast
► At each switch ASIC, the incoming unicast DLID is used as an index into a Linear ForwardingTable (LFT) that returns the outgoing switchport number
• E.g. the InfiniScale III ASIC supports all 48K possible LFT entries
Out Port #
DLID
01234567891011
© 2010 Voltaire Inc. 24
The real wording should be « rearrangeably non-blocking »
36p switch
Nodes 1-18
36p switch
Nodes 19-36
36p switch
Nodes 37-54
36p switch
Nodes 55-72
36p switch 36p switch
Each link represents 9 cables
18 uplinks
54 nodes
At boot time, 3 routes are assigned to each uplink, lets assume:
19-37-55 on port #1
20-38-56 on port #2, etc…
What happens if you have a job running on nodes 1-2-3-19-37-55 ?
Unbalanced communication, congestion…
© 2010 Voltaire Inc. 25
TARA Optimization
► TARA provides the following benefits:
• Reduces competition between fabric resources, thus decreasing congestion
• Increases available bandwidth, resulting in improved fabric utilization
• Delivers lower latency and shorter application runtime
► How ?
• Uses knowledge of cluster usage: logical servers, networks.
• Balances routes depending on usage
• Not based on real-time analysis of bandwidth / congestion
© 2010 Voltaire Inc. 26
With TARA
36p switch
Nodes 1-18
36p switch
Nodes 19-36
36p switch
Nodes 37-54
36p switch
Nodes 55-72
36p switch 36p switch
Each link represents 9 cables
18 uplinks
3 nodes
job running on nodes 1-2-3-19-37-55
At job launch time, routes to nodes used by the job are balanced over
all uplinks:
19 on port #1
37 on port #2
55 on port #3
Others are unchanged
© 2010 Voltaire Inc. 30
0
200
400
600
800
1000
1200
1400
1600
1800
2000
1.18
1.28
2.20
2.30
3.22
3.32
4.24
4.34
5.26
6.18
6.28
7.20
7.30
8.22
8.32
9.24
9.34
10.2
6
11.1
8
11.2
8
12.2
0
12.3
0
13.2
2
13.3
2
14.2
4
14.3
4
15.2
6
16.1
8
16.2
8
17.2
0
17.3
0
switch.port
po
rt w
eig
ht
0
200
400
600
800
1000
1200
1400
1600
1800
2000
1.18
1.28
2.20
2.30
3.22
3.32
4.24
4.34
5.26
6.18
6.28
7.20
7.30
8.22
8.32
9.24
9.34
10.2
6
11.1
8
11.2
8
12.2
0
12.3
0
13.2
2
13.3
2
14.2
4
14.3
4
15.2
6
16.1
8
16.2
8
17.2
0
17.3
0
switch.port
po
rt w
eig
ht
Internal ports on the line cards
traff
ic b
an
dw
idth
Traffic Optimized RoutingOpenSM
Job 147 nodes
Job 646 nodes
Job 241 nodes
Job 563 nodes
Job 371 nodes
Job 425 nodes
storage
Nodes (24)
traffic to/from storage
Average of 200MB/s per node
Internal traffic inside each job, 1000 MB/s from each node
Example: TARA with 324 nodes cluster
300 servers
24 storage nodes
Logical
Topology
Physical Topology
© 2010 Voltaire Inc. 31
Scale-out and Maintain Control on Fabric
► Dozens of switches and 1000s of nodes become a massive operational burden
► UFM automates I/O and switch configuration enabling isolation and QoS
► Central Device Management for switches and hosts
► High-availability and seamless failover of SM and UFM
► Advanced API for seamless integration in existing environments
Automatic, seamless operations save hours of configuration and set-up work
© 2010 Voltaire Inc. 32
Efficient Troubleshooting
► Dozens of traffic and health events
• Easy central drill-down to counters, alerts and events to the port level
► Configurable thresholds and criticality levels
► GUI and log level alarms
► Alerts correlated to the application level
► Alerts correlated to the DC rack level
© 2010 Voltaire Inc. 33
Open system
► Extensible architecture based on Web-services
► Open API for users or 3rd party extensions
► Expose entire fabric and datacenter object model
► Allow simple reporting, provisioning, monitoring, and task automation
► Tools already benefiting from UFM API
Scheduler integration (e.g. Moab)
UFM Support tool kit
Various command line tools/extensions to UFM
Web fabric portal
* Provided in UFM Advanced packages
© 2010 Voltaire Inc. 34
UFM Adaptive Suite- Separate UFM offering integrated with Platform LSF
Intelligent &
automatic resource
allocation
Optimize fabric
performance
Maintain
connectivity upon
changes
Central monitoring
This is the first integrated solution that correlates network fabric
management and workload management for dynamic data centers
Platform LSFService Policy
UFMFabric Provisioning
Control & Optimization
© 2010 Voltaire Inc. 35
Integration with Platform LSF - how does it work ?
Automation and Optimization
© 2010 Voltaire Inc. 36
UFM Benefits
Simple and Automated
Lowers administration tasks
time from days to minutes
Increased Performance
Reduce congestion, lower latency
Quicker application runtime
Little Fabric Visibility
Unnoticed performance degradation
Difficult to assess impact
Low Performing Unutilized Fabrics
Arbitrary routing algorithms, QoS seldom implemented
Congested fabrics, latency affected
Complex and Manual Processes
Needs admin skills
Many options left unused at all
Ineffective TroubleshootingLong troubleshooting time
Performance issues take days to analyze
Quick Issue ResolutionDashboard, Alarms, Congestion Map
Reduces downtime, high fabric utilization
In-Depth Visibility and Control
Clear health and performance visualization
Business oriented impact and root analysis
Fabrics w/o UFM UFM Customers