On Scalable Storage Area Network (SAN) Fabric Design Algorithm
IBM T. J. Watson Research
© 2004 IBM Corporation
Bong-Jun Ko (Columbia University)
Kang-Won Lee (IBM T. J. Watson Research)
Seraphin Calo (IBM T. J. Watson Research)
Motivation
SANs are becoming a popular solution as data volumes grow rapidly in enterprise computing environments.
– Replace physical connections between hosts and storage devices with a high-bandwidth Fibre Channel switching network.
– Enable data/resource sharing across multiple hosts.
– Increase the reliability and resiliency of the storage system.
A scalable SAN design solution is needed.
– SAN design is currently done manually.
– A large-scale SAN may consist of hundreds of servers and devices.
– Finding a low-cost solution is challenging.
Background
Components of SAN
– Servers
– Storage devices
– SAN fabric
• Arbitrated loop
• Switch fabric
SAN system design procedure
– Application requirement analysis (e.g., required storage, I/O rates)
– Physical constraints analysis (e.g., geographic location)
– Server/storage planning (e.g., port assignment, inter-operability)
– SAN fabric design
– Zone planning and output generation
SAN Fabric Design
Design considerations
– Fabric cost
– Resilience to node or link failure
– Future growth requirements and scalability
– Ease of maintenance for human administrators
SAN fabric configuration: mesh-based vs. core-edge-based
[Figure: example mesh-based and core-edge-based fabric topologies.]
General SAN fabric design problem
Input:
– A set of host ports, {i}, and a set of device ports, {j}.
– A set of flows, F = {f_ij}, where f_ij is the bandwidth requirement from host port i to device port j.
– A set of switch types (number of ports, cost) that can be used.
Output:
– A set of switches S and a set of links L that interconnect host, device, and switch ports.
Constraints:
– Only the given types of switches are used.
– For each flow, there exists some path from its host port to its device port.
– The aggregate bandwidth of flows on each link does not exceed the link bandwidth.
Optimization goal: minimize the cost of the SAN fabric (switches + links).
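To make the constraints concrete, here is a minimal feasibility check for a candidate design. It is only a sketch: the function name `check_design`, the dictionary layout, the explicit per-flow `paths`, and the normalized link bandwidth of 1.0 are all assumptions made for illustration, not part of the presented algorithm. It does not check that switches come from the allowed catalog.

```python
from collections import defaultdict

def check_design(flows, paths, switch_ports, link_bw=1.0, eps=1e-9):
    """Return True if every flow is routed and no link or switch is overloaded.

    flows        : {(host_port, device_port): bandwidth}   (hypothetical layout)
    paths        : {(host_port, device_port): [node, ...]} from host to device
    switch_ports : {switch_name: number of ports on that switch}
    """
    load = defaultdict(float)  # aggregate bandwidth per undirected link
    for (host, device), bw in flows.items():
        path = paths.get((host, device))
        # Constraint: some path from host port to device port must exist.
        if not path or path[0] != host or path[-1] != device:
            return False
        for a, b in zip(path, path[1:]):
            load[frozenset((a, b))] += bw
    # Constraint: aggregate bandwidth on each link must not exceed link_bw.
    if any(total > link_bw + eps for total in load.values()):
        return False
    # Each distinct link consumes one port on every switch it touches.
    used = defaultdict(int)
    for link in load:
        for node in link:
            if node in switch_ports:
                used[node] += 1
    return all(used[s] <= switch_ports[s] for s in used)

# Two hosts share one device port through a single 8-port switch.
flows = {("h1", "d1"): 0.6, ("h2", "d1"): 0.3}
paths = {
    ("h1", "d1"): ["h1", "s1", "d1"],
    ("h2", "d1"): ["h2", "s1", "d1"],
}
print(check_design(flows, paths, {"s1": 8}))
```

The design problem then amounts to searching over switch sets and link sets that pass such a check while minimizing total cost.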
Core-edge SAN fabric design problem
Additional constraints:
– Only one specific type of switch is used at each level (level = number of hops from the core switch).
– Flows are merged at host-side edge switches and split at device-side edge switches.
– The number of edge levels is bounded.
Optimization goal: minimize the cost of the SAN fabric switches.
[Figure: core-edge hierarchy with level 1 (host side), level 0 (core), and level 1 (device side).]
Challenges
Fundamental constraints in assigning flows to switches
– Bandwidth limit of a link (or a port)
– Number of ports in a switch
Numerous ways to assign flows in multiple levels
Q: Which one costs less?
[Figure: example 1, flows f1=0.4, f2=0.3, f3=0.2, f4=…=f14=0.1 assigned to 8-port switches.]
[Figure: example 2, flows f1=…=f20=0.05 assigned to 8-port and 16-port switches.]
Our Approach
Multi-stage, multi-level bin packing
Decompose the problem space
– Core-switch level minimization
• Goal: minimize the number of ports required at the core level.
• Pack flows into logical flow groups based on bandwidth.
– Edge-switch level minimization
• Goal: minimize the total cost of the edge switch fabric.
• Pack flow groups into physical switches at each level based on the number of ports.
– Effectively decouple the BW and number-of-ports constraints.
Bandwidth Packing
[Figure: animation in which flows sorted in decreasing order of bandwidth (f0=0.7 > f1=0.5 > f2=0.2 > … > fn=0.01) are packed one at a time into flow groups b1, b2, …, bm.]
Result:
– The aggregate BW of any flow group does not exceed the link BW.
– No two flow groups can be merged together.
– A group of k flows occupies k input ports and 1 output port.
– The number of flow groups generated is the number of ports required in the core switch.
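The bandwidth-packing step can be sketched as a bin-packing pass over the flows. The slides do not name the exact placement rule, so the sketch below assumes first-fit decreasing with a normalized link bandwidth of 1.0; the function name `pack_flows` and the tolerance `eps` are illustrative. First-fit does give the property listed above that no two resulting groups can be merged, because a new group is opened only when a flow fits in none of the existing ones.

```python
def pack_flows(flows, link_bw=1.0, eps=1e-9):
    """Pack flow bandwidths into groups whose totals stay within link_bw.

    Assumed heuristic: first-fit decreasing (not confirmed by the slides).
    Returns a list of groups, each a list of flow bandwidths.
    """
    groups = []  # flow bandwidths per group
    totals = []  # running aggregate bandwidth per group
    for bw in sorted(flows, reverse=True):
        if bw > link_bw + eps:
            raise ValueError("a single flow exceeds the link bandwidth")
        for g, total in enumerate(totals):
            if total + bw <= link_bw + eps:
                groups[g].append(bw)
                totals[g] += bw
                break
        else:
            # The flow fits in no existing group: open a new one.
            groups.append([bw])
            totals.append(bw)
    return groups

# Flows from the first "Challenges" example: 0.4, 0.3, 0.2 and eleven of 0.1.
example = [0.4, 0.3, 0.2] + [0.1] * 11
groups = pack_flows(example)
core_ports = len(groups)                 # one core port per flow group
edge_ports = [len(g) + 1 for g in groups]  # k inputs + 1 output per group
print(core_ports, edge_ports)
```

The `edge_ports` list is what the next stage would consume: each group of k flows needs k input ports and 1 output port on the edge side.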
Mapping Flow Groups into Physical Switches
[Figure: animation in which flow groups with port counts 16, 13, 10, 7, 6, 4, 3, 3 are packed one at a time into 16-port switches s1, s2, …, sm.]
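Packing by port count can be sketched the same way, with switch ports as the bin capacity. The placement rule is not stated in the slides; the sketch below assumes a simple next-fit rule over the port counts in decreasing order, and the name `pack_ports` is hypothetical. It uses the port counts 16, 13, 10, 7, 6, 4, 3, 3 shown above with 16-port switches.

```python
def pack_ports(port_counts, switch_ports=16):
    """Assign flow groups (given by their port counts) to switches.

    Assumed heuristic: next-fit decreasing. Only the most recently opened
    switch is considered; a new switch is opened when a group does not fit.
    """
    switches = []  # each switch is a list of the port counts placed on it
    for need in sorted(port_counts, reverse=True):
        if need > switch_ports:
            raise ValueError("a flow group needs more ports than one switch has")
        if switches and sum(switches[-1]) + need <= switch_ports:
            switches[-1].append(need)
        else:
            switches.append([need])
    return switches

assignment = pack_ports([16, 13, 10, 7, 6, 4, 3, 3])
print(len(assignment), assignment)
```

Under this assumed rule the example needs five 16-port switches. A rule that revisits earlier switches (for example first-fit) could use fewer, which is one reason the choice of allocation matters for cost.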
[Figure: alternative assignments of flow groups to 8-port and 16-port switches at different levels.]
– Higher allocation: fewer lower-level switches.
– Lower allocation: fewer higher-level switches.
Q: Which one is better?
Go High or Low?
Switch cost grows faster than linearly with the number of ports.
e.g., list prices (as of Aug 2004)
• IBM 3534 (8 ports): $5,136
• IBM 2106 (16 ports): $15,511
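A quick calculation with the two list prices above shows the size of the gap. This is only arithmetic on the quoted figures; it ignores link cost and any ports that two smaller switches would spend on connecting to the rest of the fabric.

```python
# List prices quoted on the slide (USD, Aug 2004), keyed by port count.
PRICES = {8: 5136, 16: 15511}

# Cost per port for each switch type.
per_port = {ports: cost / ports for ports, cost in PRICES.items()}

# Two 8-port switches versus one 16-port switch.
two_small = 2 * PRICES[8]
saving = PRICES[16] - two_small

print(per_port)   # {8: 642.0, 16: 969.4375}
print(two_small)  # 10272
print(saving)     # 5239
```

Per port, the 16-port switch costs roughly 1.5 times as much as the 8-port switch, which is why replacing one large switch with two small ones can reduce cost.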
“Bottom-Up” approach
– Start with the lowest possible assignment.
– Re-assign flows to higher-level switches.
– Pack flow groups in lower levels based on the reduced port counts.
– Merge lower-level switches whenever it saves cost.
– Repeat merging recursively along the switch hierarchy.
Reducing Edge Switch Cost
[Figure: two-step example of edge switch assignments using 8-port and 16-port switches.]
Replaced one 16-port switch with two 8-port switches: cost reduced!
Demo
Future Work
Performance analysis
– Compare with other approaches, e.g., an IP solver
– Derive analytical bounds
– Quantify adaptability to future growth
• Open question: How different are two trees?
Incorporate into IBM SAN design tool