Infiniband subnet management
• Discuss the Infiniband subnet management system
• Discuss fat trees and subnet management in an Infiniband network with a fat tree topology
• References
• A. Bermudez, R. Casado, F.J. Quiles, T. M. Pinkston, J. Duato, “Evaluation of a Subnet Management Mechanism for Infiniband Networks”, ICPP 2003.
• A. Vishnu, A. R. Mamidala, H. Jin, D. K. Panda, “Performance Modeling of Subnet Management on Fat Tree Infiniband Networks using OpenSM”, Workshop on System Management Tools on Large Scale Parallel Systems, held in conjunction with IPDPS 2005.
• X. Lin, Y. Chung, T. Huang, “A Multiple LID Routing Scheme for Fat-Tree-Based Infiniband Networks”, IPDPS 2004.
• Infiniband devices and entities related to subnet management
• Devices: Channel Adapters (CA), Host Channel Adapters, switches, routers
• Subnet manager (SM): discovering, configuring, activating and managing the subnet
• A subnet management agent (SMA) in every device generates and responds to control packets (subnet management packets, SMPs) and configures local components for subnet management
• The SM exchanges control packets with the SMAs through the subnet management interface (SMI).
• Subnet management packets (SMP)
– 256 bytes of data
– Use the unreliable datagram service on the management virtual lane (VL 15)
– LID routed: switches use a lookup table for forwarding
  • Used after the subnet is set up, e.g. to check the status of an active port
– Direct routed: the packet carries the output port for each intermediate hop.
  • Used for subnet discovery, before the subnet is set up
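The difference between the two routing modes can be sketched in a few lines. This is a toy model only: the function names, the table layout, and the `hop_pointer` argument are assumptions made for illustration and are not taken from the Infiniband specification or from OpenSM.

```python
# Toy model of the two SMP routing modes (illustrative names only).

def forward_lid_routed(forwarding_table, dest_lid):
    """LID routed: the switch looks up the output port for the destination LID."""
    return forwarding_table[dest_lid]

def forward_direct_routed(initial_path, hop_pointer):
    """Direct routed: the packet itself lists the output port for every hop."""
    return initial_path[hop_pointer]

# A switch whose table already maps LID 5 to port 2 and LID 9 to port 3.
table = {5: 2, 9: 3}
lid_port = forward_lid_routed(table, 9)

# A direct-routed packet that must leave through ports 1, 4, 2 in turn.
path = [1, 4, 2]
ports = [forward_direct_routed(path, hop) for hop in range(len(path))]
print(lid_port, ports)
```

LID routing needs the forwarding table to exist already, while direct routing needs nothing from the switch except the ability to read the path in the packet.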
• Subnet management packets (SMP)
– Define the operation to be performed by the SM
– Get: get information about a CA, switch, or port
– Set: set an attribute of a port (e.g. its LID)
– GetResp: response to a Get
– Trap: inform the SM about the state of a local node
  • An SMA stops sending a Trap message only once it receives a TrapRepress packet.
• Topology information can be obtained by a sweep and by periodic Traps.
• Subnet management phases:
– Topology discovery: sending direct routed SMPs to every port and processing the responses.
– Path computation: computing valid paths between each pair of end nodes
– Path distribution: configuring the forwarding tables
• Subnet discovery
– The SM starts by sending a direct routed Get SMP to its local node. Upon receiving the response, the SM sends SMPs of increasing depth.
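A breadth-first sweep is one natural way to picture this discovery. The sketch below is an assumption-laden toy, not how OpenSM is implemented: the `links` dictionary stands in for the Get/GetResp exchange, and the node names are hypothetical.

```python
from collections import deque

def discover(links, sm_node):
    """Sweep the subnet breadth-first from the SM's local node.

    links: {node: {output_port: neighbor}}, a stand-in for Get/GetResp replies.
    Returns {node: direct route}, where a direct route is the list of output
    ports to take at each hop starting from the SM's node.
    """
    routes = {sm_node: []}          # the local node is reached with an empty path
    queue = deque([sm_node])
    while queue:
        node = queue.popleft()
        for port, neighbor in sorted(links[node].items()):
            if neighbor not in routes:
                # One hop deeper: extend the route that reached `node`.
                routes[neighbor] = routes[node] + [port]
                queue.append(neighbor)
    return routes

# Hypothetical subnet: SM host, two switches, two hosts.
links = {
    "SM": {1: "S1"},
    "S1": {1: "SM", 2: "S2", 3: "H1"},
    "S2": {1: "S1", 2: "H2"},
    "H1": {1: "S1"},
    "H2": {1: "S2"},
}
routes = discover(links, "SM")
for node, route in routes.items():
    print(node, route)
```

Each discovered route is exactly the per-hop port list that a direct routed SMP would carry to reach that node.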
• Path computation:
– Compute paths between all pairs of nodes
– For irregular topology:
  • Up/Down routing does not work directly
    – It needs information about both the incoming interface and the destination, and Infiniband forwards on the destination only
    – Potential solution:
      » find all possible paths
      » remove all possible down links following up links in each node
      » find one output port for each destination
    – Why is that still working? Not clear to me.
    – Other solutions: destination renaming
– Fat tree topology:
  • What is the best that can be achieved is also not clear.
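For reference, the classic Up/Down rule itself can be sketched as follows. This shows the generic rule (a legal path never takes an up link after a down link), not the Infiniband-specific workaround listed above; the BFS level assignment, the tie-break by node id, and the small example graph are all assumptions made for illustration.

```python
from collections import deque

def bfs_levels(adj, root):
    """Distance of every node from the chosen root of the spanning tree."""
    level = {root: 0}
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in level:
                level[v] = level[u] + 1
                queue.append(v)
    return level

def is_up(level, u, v):
    """Link u -> v is 'up' if v is closer to the root (ties broken by node id)."""
    return (level[v], v) < (level[u], u)

def is_legal(level, path):
    """Up/Down rule: once a down link is taken, no up link may follow."""
    seen_down = False
    for u, v in zip(path, path[1:]):
        if is_up(level, u, v):
            if seen_down:
                return False
        else:
            seen_down = True
    return True

# Hypothetical 4-switch irregular network with root 0.
adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
level = bfs_levels(adj, 0)
print(is_legal(level, [1, 0, 2]))  # up then down
print(is_legal(level, [1, 3, 2]))  # down then up
```

Whether the second hop of a path is allowed depends on how the packet arrived, which is exactly the information a destination-only forwarding table does not have.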
• Path distribution:
– Ordering issue: the network may be in an inconsistent state when partially updated, which may result in deadlock during this period.
  • Traditional solution: send no data packets for a period of time
  • Deadlock-free reconfiguration schemes.
• Fat Tree:
– A way to build large scale clusters
• Fat Tree:
– Routing in a complete fat tree: an up phase and a down phase (always contention free).
• Fat Tree:
– The complete fat tree has a scalability problem
  • The root has a very large nodal degree
– How to build a fat tree with nodes that have a constant nodal degree?
– m-port n-tree FT(m, n)
  • m is the number of ports per switch
  • n+1 is the height of the tree
  • The tree consists of 2*(m/2)^n processing nodes and (2n-1)*(m/2)^(n-1) switches.
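As a quick check of the two counting formulas, the sketch below evaluates them for a given m and n. The function name `ft_counts` is made up for this example, and m is assumed to be even.

```python
def ft_counts(m, n):
    """Processing-node and switch counts of an m-port n-tree FT(m, n).

    Assumes m is even, so that m/2 is an integer.
    """
    half = m // 2
    nodes = 2 * half ** n                      # 2*(m/2)^n
    switches = (2 * n - 1) * half ** (n - 1)   # (2n-1)*(m/2)^(n-1)
    return nodes, switches

print(ft_counts(4, 3))  # the 4-port 3-tree: (16, 20)
print(ft_counts(8, 2))  # (32, 12)
```

Every switch has the same number of ports m, so the tree grows by adding levels rather than by widening the root.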
– How is an m-port n-tree FT(m, n) connected?
• A processing node is labeled as P(p_0 … p_{n-1})
– p_0 = 0..m-1, p_i (i != 0) = 0..m/2-1
• A switch is labeled as SW<w_0 … w_{n-2}, l>
– l = 0..n-1
– When l = 0, w_i = 0..m/2-1
– When l != 0, w_0 = 0..m-1, w_i (i != 0) = 0..m/2-1
• Figure 5: a 4-port 3-tree
• How is the tree connected?
– Let SW<w, l>_k be the k-th port of SW<w, l>.
– SW<w, l>_k and SW<w', l'>_k' are connected iff
  • l' = l + 1
  • w_0 … w_{n-3} = w'_0 … w'_{l-1} w'_{l+1} … w'_{n-2}
  • k = w'_l, k' = w_{n-2} + m/2
– Question: which switches are connected to SW<001, l>? How are the ports connected?
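The labeling and connection rules can be turned into a small enumeration, which may help when working through questions like the one above. This is a sketch under stated assumptions: it follows the switch-to-switch rule exactly as written on this slide, does not model the links to processing nodes (the slide gives no rule for them), assumes m is even and n >= 2, and uses made-up function names.

```python
from itertools import product

def switches(m, n):
    """All switch labels (w, l), with w a tuple of n-1 digits and l the level."""
    half = m // 2
    result = []
    for level in range(n):
        first = range(half) if level == 0 else range(m)  # range of w_0
        for w in product(first, *[range(half)] * (n - 2)):
            result.append((w, level))
    return result

def edges(m, n):
    """Switch-to-switch links as ((w, l, k), (w', l', k')) port pairs."""
    half = m // 2
    result = []
    for w_child, l_child in switches(m, n):
        if l_child == 0:
            continue                      # level-0 switches have no parent
        l = l_child - 1                   # l' = l + 1
        # w_0..w_{n-3} equals w' with digit w'_l removed.
        prefix = w_child[:l] + w_child[l + 1:]
        for last in range(half):          # w_{n-2} is free
            w_parent = prefix + (last,)
            k = w_child[l]                # k  = w'_l
            k_child = last + half         # k' = w_{n-2} + m/2
            result.append(((w_parent, l, k), (w_child, l_child, k_child)))
    return result

sw = switches(4, 3)
ed = edges(4, 3)
print(len(sw), len(ed))
# Links of one chosen switch, here SW<00, 0>:
for parent, child in ed:
    if parent[:2] == ((0, 0), 0):
        print(parent, "<->", child)
```

For the 4-port 3-tree this yields the 20 switches given by the counting formula, and no port appears in more than one link.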
• Fat tree properties:
– Multiple routes between two nodes
  • Deterministic routing: one path between two nodes, how to map?
– What is a good mapping?
  • When the traffic pattern is unknown, common practice is to minimize the maximum load on a link.
– Do we know how to do it?
  • Not clear even when there is no restriction on the routing. It is likely that an optimal solution exists for a particular fat tree topology
  • In Infiniband, destination-based routing puts some restrictions on which paths can be used.