Ikki Fujiwara, Michihiro Koibuchi (National Institute of Informatics)
Hiroki Matsutani (Keio University)
Henri Casanova (University of Hawaii at Manoa)
IPDPS 2014 / May 20th, 2014 / Phoenix, Arizona, USA

Slides: research.nii.ac.jp/~koibuchi/pdf/ipdps2014-skywalk-print.pdf

Page 2: The Light Speed is Fixed

• In vacuum: c ≈ 0.3 m/ns
• In a cable: c ≈ 0.2 m/ns = 5.00 ns/m

Page 3: Switch Delay is Continuously Decreasing

Equivalent cable length = switch delay (1 hop) ÷ 5 ns/m

Switch                  Switch delay    Equivalent cable length
Cisco SFS7000D          200 ns          40 m
QLogic 12300            140 ns          28 m
A future product (?)    60 ns           12 m

Switch delay will no longer dominate the end-to-end communication latency
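The conversion on this slide can be reproduced with a few lines of Python. This is only a sketch of the arithmetic; the function name `equivalent_cable_length_m` and the dictionary layout are illustrative choices, not part of the talk.

```python
# Cable propagation delay quoted on the previous slide: about 5 ns per metre.
CABLE_DELAY_NS_PER_M = 5.0


def equivalent_cable_length_m(switch_delay_ns: float) -> float:
    """Cable length whose propagation delay equals one switch traversal."""
    return switch_delay_ns / CABLE_DELAY_NS_PER_M


# Switch delays as listed on the slide.
switch_delays_ns = {
    "Cisco SFS7000D": 200.0,
    "QLogic 12300": 140.0,
    "A future product": 60.0,
}

for name, delay_ns in switch_delays_ns.items():
    length_m = equivalent_cable_length_m(delay_ns)
    print(f"{name}: {delay_ns:.0f} ns per hop = {length_m:.0f} m of cable")
```

The smaller the switch delay, the shorter the cable that costs as much as one hop, which is why cable length starts to matter for topology design.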

Page 4: What Happens in the Future

[Figure: maximum latency [µs] (0.8 to 3.2) vs. switch delay [ns] (0 to 180) for Random (degree=11, diameter=5) and Hypercube (degree=11, diameter=11)]

Traditional Hypercube outperforms the same-degree Random topology!
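One way to see why a larger-diameter topology can win is a simplified latency model. The model below is an assumption made for illustration, not the authors' exact formulation: it ignores injection delay, treats the diameter as the hop count of the worst-case path, and introduces the hypothetical symbols L_H and L_R for the total cable length along the worst-case Hypercube and Random paths.

```latex
% Simplified latency of a path with h hops, switch delay s,
% and total cable length L (tau is the cable delay per metre):
T(s) = h\,s + \tau L, \qquad \tau \approx 5\ \text{ns/m}

% Hypercube (h = 11) beats Random (h = 5) on the worst-case path when
11\,s + \tau L_H < 5\,s + \tau L_R
\iff
s < \frac{\tau\,(L_R - L_H)}{6}
```

Under this model, the six extra hops of the Hypercube stop mattering once the switch delay drops below the cable-delay advantage of its shorter links.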

Page 5: Topology Design Trends

[Figure: topologies placed on an axis from Geometrical Design to Topological Design]

• Torus / Hypercube
• HyperX [Ahn et al. SC09]
• Skywalk
• Ring+Random [Koibuchi et al. ISCA12]
• Jellyfish [Singla et al. NSDI12]

Page 6: Agenda

• Introduction
• Skywalk construction
    • Intra-cabinet links
    • Inter-cabinet links
• Graph analysis
• Cycle-accurate simulation
• Conclusion

Page 7: Intra-cabinet Links

[Figure: a cabinet containing switches and their hosts (compute nodes); hereafter the hosts are omitted]

Step 1: Randomly connect the switches in each cabinet (possibly fully connected)

Page 8: Inter-cabinet Links

[Figure: cabinets arranged on the floor]

Step 2: Randomly connect the cabinets in each row

Page 9: Inter-cabinet Links

Step 2: Randomly connect the cabinets in each row
Step 3: Randomly connect the cabinets in each column
Step 4: Randomly connect the remaining cabinets (optional)

Page 10: Skywalk Construction

Step 1: Randomly connect the switches in each cabinet (possibly fully connected)
Step 2: Randomly connect the cabinets in each row
Step 3: Randomly connect the cabinets in each column
Step 4: Randomly connect the remaining cabinets (optional)
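The construction steps can be sketched in Python as follows. This is a minimal illustration under several assumptions that are not taken from the talk: intra-cabinet links are fully connected, "randomly connect" is interpreted as one random pairing of cabinets per row and per column, inter-cabinet links are attached to a cabinet's switches in round-robin order, and the optional step 4 is left out. The function name `skywalk_sketch` and the `(row, col, k)` switch identifiers are hypothetical.

```python
import itertools
import random


def skywalk_sketch(rows, cols, z, seed=0):
    """Return a set of undirected links between switches.

    A switch is identified as (row, col, k): switch k of the cabinet at
    grid position (row, col). Each link is a frozenset of two switches.
    """
    rng = random.Random(seed)
    links = set()
    next_switch = {}  # cabinet -> switch index that takes its next inter-cabinet link

    def connect_cabinets(cab_a, cab_b):
        # Attach the link to one switch per cabinet, round-robin (assumption).
        ends = []
        for cab in (cab_a, cab_b):
            k = next_switch.get(cab, 0)
            next_switch[cab] = (k + 1) % z
            ends.append(cab + (k,))
        links.add(frozenset(ends))

    def random_pairs(cabinets):
        # One random pairing; with an odd count, one cabinet stays unpaired.
        shuffled = list(cabinets)
        rng.shuffle(shuffled)
        return zip(shuffled[0::2], shuffled[1::2])

    # Step 1: connect the switches inside each cabinet (fully connected here).
    for r, c in itertools.product(range(rows), range(cols)):
        for ka, kb in itertools.combinations(range(z), 2):
            links.add(frozenset([(r, c, ka), (r, c, kb)]))

    # Step 2: randomly connect the cabinets in each row.
    for r in range(rows):
        for a, b in random_pairs([(r, c) for c in range(cols)]):
            connect_cabinets(a, b)

    # Step 3: randomly connect the cabinets in each column.
    for c in range(cols):
        for a, b in random_pairs([(r, c) for r in range(rows)]):
            connect_cabinets(a, b)

    return links


links = skywalk_sketch(rows=4, cols=4, z=2)
print(len(links), "links")
```

Restricting long links to a single row or column is what keeps the random links short on the floor. A fuller version would repeat the pairing until every switch reaches its target inter-cabinet degree.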

Page 11: Skywalk Details

Parameters
• z = number of switches in each cabinet
• c = number of cabinets
• d_i = number of intra-cabinet links at a switch
• d_o = number of inter-cabinet links at a switch
• d = d_i + d_o = total degree

Cyclic linking
• Each inter-cabinet link of a cabinet is attached to one of that cabinet's switches, chosen in a cyclic manner

Fastest routing
• Packets choose lowest-latency paths (not shortest-hop paths)
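The difference between lowest-latency and shortest-hop routing can be illustrated with Dijkstra's algorithm. The sketch below assumes a simple cost model that is not spelled out on the slide: each hop costs one switch delay plus 5 ns per metre of cable. The function `route`, the graph layout, and the three-switch example are all hypothetical.

```python
import heapq

CABLE_DELAY_NS_PER_M = 5.0  # cable delay quoted earlier in the talk


def route(graph, src, dst, switch_delay_ns, minimize="latency"):
    """Dijkstra over a switch graph given as {switch: {neighbor: cable_length_m}}.

    Returns (latency_ns, path) for the path that minimizes latency, or
    the hop count when minimize="hops" (ties broken by latency).
    """
    done = set()
    heap = [(0.0, 0.0, 0, src, (src,))]  # (cost, latency, hops, node, path)
    while heap:
        cost, latency, hops, node, path = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        if node == dst:
            return latency, list(path)
        for nxt, length_m in graph[node].items():
            if nxt in done:
                continue
            # Assumed model: one switch delay plus cable delay per hop.
            new_latency = latency + switch_delay_ns + CABLE_DELAY_NS_PER_M * length_m
            new_cost = new_latency if minimize == "latency" else hops + 1
            heapq.heappush(heap, (new_cost, new_latency, hops + 1, nxt, path + (nxt,)))
    raise ValueError("no path between the given switches")


# Hypothetical example: a long direct cable vs. two short cables via C.
graph = {
    "A": {"B": 60.0, "C": 2.0},
    "B": {"A": 60.0, "C": 2.0},
    "C": {"A": 2.0, "B": 2.0},
}
print(route(graph, "A", "B", switch_delay_ns=60.0))                   # two hops via C
print(route(graph, "A", "B", switch_delay_ns=60.0, minimize="hops"))  # direct link
```

With a 60 ns switch delay the two-hop detour wins in this example because the 60 m cable alone costs 300 ns. With a much larger switch delay the direct link becomes the faster choice again.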

Page 12: Standpoints of Skywalk and Dragonfly

[Figure: topologies placed on an axis from Geometrical Design to Topological Design]

• Torus / Hypercube
• HyperX [Ahn et al. SC09]
• Ring+Random [Koibuchi et al. ISCA12]
• Jellyfish [Singla et al. NSDI12]

Dragonfly: a 2-layer hierarchical meta-topology with intra-group and inter-group sub-topologies

Skywalk: a Dragonfly instance
• group = cabinet
• intra-group: random
• inter-group: random

Page 13: Agenda

• Introduction
• Skywalk construction
• Graph analysis
    • Switch delay vs. latency
    • Degree vs. latency
    • Total cable length vs. latency
    • Network size vs. latency
    • Cabinet size vs. latency
• Cycle-accurate simulation
• Conclusion

Page 14: Graph Analysis: Setup

Parameters (unless otherwise specified):
• z = 8 switches/cabinet
• c = 256 cabinets arranged in a 16×16 grid
• N = 2,048 switches in total
• Switch delay = 60 ns
• Packet injection delay = 300 ns

Featured topologies:
• Skywalk: fully connected intra-cabinet links
• Random: d-degree uniform random
• Torus: 3-D (8×16×16) or 5-D (8×4×4×4×4)
• HyperX: tailored to map onto the floorplan
• Dragonfly: group = cabinet, fully connected for both intra- and inter-group

See the proceedings for average latency

Page 15: Switch Delay vs. Latency

[Figure: maximum latency [µs] (0 to 3.5) vs. switch delay [ns] (0 to 500) for 3-D Torus (d=6), Hypercube (d=11), Random (d=11), Skywalk (d=11), and Dragonfly (d=39), with an inset for switch delays of 0 to 60 ns (0.5 to 0.9 µs). d = degree.]

* HyperX is omitted. See the proceedings for complete results.

Skywalk leads to the lowest latency with ultra-low-delay switches and also with high-delay switches

Page 16: Degree vs. Latency

[Figure: maximum latency [µs] (0.7 to 1.5) vs. degree (0 to 40) for 5-D Torus, HyperX, Random, Skywalk, and Dragonfly]

* Skywalk with d_i = {1, 4} and Hypercube are omitted. See the proceedings for complete results.

Skywalk leads to a desirable tradeoff between degree and latency

Page 17: Total Cable Length vs. Latency

[Figure: maximum latency [µs] (0.7 to 1.5) vs. total cable length [km] (0 to 600) for 5-D Torus, HyperX, Random, Skywalk, and Dragonfly]

* Skywalk with d_i = {1, 4} and Hypercube are omitted. See the proceedings for complete results.

Skywalk saves 90% cable length over Dragonfly with only 19% higher maximum latency

Page 18: Network Size vs. Latency

[Figure: maximum latency [µs] (0.5 to 1.3) vs. number of switches (128 to 8192) for Skywalk (d=8, 16, 32, 64), Dragonfly (d=9, 15, 39, 135), and 3-D Torus (d=6). d = degree.]

* Hypercube is omitted. See the proceedings for complete results.

Skywalk scales well with relatively low degree

Page 19: Cabinet Size vs. Latency

[Figure: maximum latency [µs] (0.6 to 1.2) vs. number of switches per cabinet (2 to 128) for Skywalk (d=8, 16, 32). d = degree.]

Skywalk has an optimal cabinet size because it becomes similar to Random with very large or very small cabinets

Page 20: Agenda

• Introduction
• Skywalk construction
• Graph analysis
• Cycle-accurate simulation
    • Throughput vs. latency
• Conclusion

Page 21: Cycle-accurate Simulation: Setup

Topology parameters:
• h = 8 hosts/switch
• z = 4 switches/cabinet
• c = 64 cabinets arranged in an 8×8 grid
• N = 256 switches in total
• Switch delay = 60 ns

Simulation parameters:
• Adaptive deadlock-free routing
• 4 virtual channels
• 256 bits/flit × 33 flits/packet = 8,448 bits/packet
• 96 Gbps/switch ÷ 8 hosts/switch = 12 Gbps/host max.
• Random uniform traffic

See the proceedings for:
• Bit reversal traffic
• Matrix transpose traffic
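The setup arithmetic can be checked in a few lines of Python. The packet size, per-host bandwidth, and switch count come from the slide; the total host count and the per-packet serialization time are derived here for illustration and are not stated in the talk.

```python
# Values taken from the simulation setup slide.
FLIT_BITS = 256
FLITS_PER_PACKET = 33
SWITCH_BANDWIDTH_GBPS = 96
HOSTS_PER_SWITCH = 8
SWITCHES_PER_CABINET = 4
CABINETS = 64  # 8 x 8 grid

packet_bits = FLIT_BITS * FLITS_PER_PACKET                      # 8,448 bits/packet
host_bandwidth_gbps = SWITCH_BANDWIDTH_GBPS / HOSTS_PER_SWITCH  # 12 Gbps/host max.
switches = SWITCHES_PER_CABINET * CABINETS                      # 256 switches

# Derived quantities (not on the slide).
hosts = switches * HOSTS_PER_SWITCH
# 1 Gbps equals 1 bit/ns, so bits divided by Gbps gives nanoseconds.
packet_serialization_ns = packet_bits / host_bandwidth_gbps

print(f"{packet_bits} bits/packet, {host_bandwidth_gbps:.0f} Gbps/host")
print(f"{switches} switches, {hosts} hosts")
print(f"{packet_serialization_ns:.0f} ns to serialize one packet at the host rate")
```

At the maximum host rate, serializing one packet takes far longer than a single 60 ns switch traversal, which helps put the latency values in the result plot in context.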

Page 22: Cycle-accurate Simulation: Result

[Figure: latency [µs] (0.3 to 1.1) vs. accepted traffic [Gbit/sec/host] (0 to 12) for Skywalk (d=11), Hypercube (d=8), Dragonfly (d=19), 3-D Torus (d=6), and Random (d=11). d = degree.]

* HyperX is omitted. See the proceedings for complete results.

Skywalk achieves low latency and higher throughput than Random at lower degree than Dragonfly

Page 23: Agenda

• Introduction
• Skywalk construction
• Graph analysis
• Cycle-accurate simulation
• Conclusion

Page 24: Wrap-up

• The speed of light affects topology design once ultra-low-delay switches are put into practical use
• We propose the “Skywalk” topology that uses randomness in a layout-conscious way
• Skywalk achieves desirable tradeoffs between end-to-end latency and degree or cable length
• Cycle-accurate simulations show that Skywalk provides not only low latency but also high throughput at low degree

[Figure: Skywalk placed on the axis from Geometrical Design to Topological Design]