Quadrics QsNetIII an HPC Interconnect for PetaScale Systems - Presentation

  • 8/14/2019 Quadrics QsNetIII an HPC Interconnect for PetaScale Systems - Presentation

    1/19

    QsNet III An HPC Interconnect

    for PetaScale Systems

    Duncan Roweth, Quadrics Ltd

    ISC08, Dresden, June 2008

    Quadrics Background

    Develops interconnect products for the HPC market
      HPC Linux systems
      AlphaServer SC systems

    Quadrics is owned by the Finmeccanica group

    Quadrics will be 12 years old in July

    Interconnect Network QsNet

    QsNet III Network
      Multi-stage switch network
      Evolution of the QsNet II design
      Increased use of commodity hardware
      Increasing support for standard software

    QsNet III Components
      ASICs: Elan5 and Elite5
      Adapters, switches, cables
      Firmware, drivers, libraries
      Diagnostics, documentation

    [Figure: Elan5 adapter block diagram. Labelled blocks: PCIe host interface (16 lanes, SERDES), command launch, TLB, seven packet engines (each with 16K instruction cache and 9K data buffers), buffer manager, object cache tags, free list, local memory (16K x 8 x 8 banks = 1MB ECC RAM), external cache, external DDR II SDRAM interface, fabric, bridge, "x8", two CX4/QsNet III links, local functions, PLL, EEPROM, clocks.]

    Elan5 Adapter Overview

    2 x 25 Gbit/s QsNet III links
    PCIe, PCIe2 host interface
    Multiple packet engines
    512KB of high bandwidth on-chip local memory
    SDRAM interface to optional local memory
    Buffer manager, object cache

    QsNet III Adapter Overview

    QM700
      PCIe x16
      128MB adapter memory
      2 QSFP links
      Half height, low profile

    Adapter variants
      PCIe Gen2
      Blade formats
      10Gbit/s Ethernet
      10GBase-CX4

    QsNet III Adaptive Routing

    Packet-by-packet dynamic routing
      Single-cycle routing decision
    Selects route based on
      Link state, errors, etc.
      Number of pending acks
    High-radix switches
      2 routing decisions for 2048 nodes
    More flexible than QsNet II
      Operates on groups of links
      Can adaptively route up or down
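
The selection criteria on this slide (link state, errors, pending acks, applied over a group of links) can be illustrated with a small sketch. This is not the Elite5 hardware algorithm, which the slides do not describe; the `Link` class, the `choose_link` function, and the "fewest pending acks wins" rule are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Link:
    """Hypothetical per-link state; field names are illustrative only."""
    name: str
    up: bool = True
    errors: int = 0
    pending_acks: int = 0


def choose_link(group, max_errors=0):
    """Pick a link from a group for the next packet.

    Assumed policy: discard links that are down or have seen errors,
    then take the one with the fewest outstanding acks.
    """
    usable = [link for link in group if link.up and link.errors <= max_errors]
    if not usable:
        raise RuntimeError("no usable link in group")
    return min(usable, key=lambda link: link.pending_acks)


group = [
    Link("up0", pending_acks=3),
    Link("up1", up=False),                 # failed link, never chosen
    Link("up2", errors=5, pending_acks=0), # errored link, never chosen
    Link("up3", pending_acks=1),
]
print(choose_link(group).name)
```

Because the decision is made per packet, a link that fails or backs up simply stops being selected; nothing upstream has to be reconfigured.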

    Bandwidth scalability, 1024 nodes

    Bandwidth achieved when 1024 nodes all communicate at the same time

    QsNet II provides better average bandwidth and a much narrower spread between best and worst case performance

    System    Interconnect   Min   Max   Average
    Atlas     InfiniBand      95   762   263
    Thunder   QsNet II       248   403   369

    Data from Lawrence Livermore National Lab, published at the Sonoma OpenFabrics workshop June 2007
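
The "narrower spread" claim can be checked directly from the table. The sketch below uses only the six figures quoted on the slide; the choice of max/min as the spread measure is an assumption, and the slide does not state the units.

```python
# Figures copied from the slide's table (units not stated on the slide).
results = {
    "Atlas (InfiniBand)": {"min": 95, "max": 762, "avg": 263},
    "Thunder (QsNet II)": {"min": 248, "max": 403, "avg": 369},
}


def spread(row):
    """Ratio of best-case to worst-case bandwidth (assumed spread measure)."""
    return row["max"] / row["min"]


for name, row in results.items():
    print(f"{name}: max/min = {spread(row):.1f}, avg = {row['avg']}")
```

On these numbers the best case is roughly 8 times the worst case on Atlas and roughly 1.6 times on Thunder, while the Thunder average is about 40% higher.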

    QsNet III Federated Network Switches

    Node switch chassis
      128 links up, 128 down
    Same chassis provides multiple top switch configurations:
      64 x 4 for 512-way systems
      32 x 8 for 1024-way systems
      16 x 16 for 2048-way systems
      8 x 32 for 4096-way systems
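
The pairs of numbers listed for each system size follow a simple pattern, sketched below. The reading of each pair as (top switches per chassis, ports per top switch) is an interpretation of the slide, not something it states; the constant and function names are hypothetical.

```python
NODE_CHASSIS_DOWN_LINKS = 128  # "128 links up 128 down" on the slide

# Assumed reading: (top switches per chassis, ports per top switch, system size)
TOP_CONFIGS = [(64, 4, 512), (32, 8, 1024), (16, 16, 2048), (8, 32, 4096)]


def node_chassis_needed(system_size):
    """Node switch chassis required if each serves 128 nodes."""
    return system_size // NODE_CHASSIS_DOWN_LINKS


for count, ways, size in TOP_CONFIGS:
    # count * ways is the same for every row: the chassis port count is fixed
    # and is merely partitioned differently.
    print(f"{size}-way: {count} x {ways} = {count * ways} ports, "
          f"{node_chassis_needed(size)} node chassis")
```

Under this reading every configuration uses the same 256 ports, and the width of each top switch equals the number of node switch chassis it has to reach.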

    QsNet III Network 4096 way

    QsNet III Cables

    QSFP connectors throughout
    Optical cables (e.g. Luxtera), 5-300m
      PVDF Plenum rated
      LSZH available as an option
    Active copper cables (Gore), 8-20m
    Copper cables (Gore), 1-10m
    No longer Quadrics proprietary
    Bit error rates are a big issue at 5 Gbps and above
      Optical cables between switches
      Short copper cables from nodes
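
The length ranges quoted on this slide overlap, so more than one cable type can suit a given run. A minimal lookup, assuming the ranges are inclusive and using a hypothetical `cable_options` helper:

```python
# (cable type, minimum length in metres, maximum length in metres), from the slide
CABLES = [
    ("copper", 1, 10),
    ("active copper", 8, 20),
    ("optical", 5, 300),
]


def cable_options(length_m):
    """Return the cable types whose quoted range covers the given length."""
    return [name for name, lo, hi in CABLES if lo <= length_m <= hi]


for length in (3, 9, 50):
    print(length, cable_options(length))
```

This only reflects the quoted lengths; the slide's own guidance is to prefer optical between switches and short copper from the nodes.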

    QsNet III for HP BladeSystem

    Elite5 switch module
      Full bandwidth
      16 links to the blades (via backplane)
      16 links to back of the module

    Elan5 mezzanine adapter
      2 QsNet links
      PCI-E x8 (initially)
      128 MB of memory

    2048-way QsNet III BladeSystem Network

    Building a 16K node system in 2009/10

    Single water cooled rack will provide 1000-2000 standard cores, ~12-25 TF
    8 blade switches per rack
    Connect 128 of these racks with 1024-way top switches
    Single fibre cable per node for full bi-section bandwidth
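
The "16K node" figure follows from the numbers on this slide and the BladeSystem switch module slide (16 links to the blades per switch). The arithmetic below assumes one node per blade-facing link; variable names are illustrative.

```python
BLADE_SWITCHES_PER_RACK = 8       # this slide
LINKS_TO_BLADES_PER_SWITCH = 16   # Elite5 switch module slide
RACKS = 128                       # this slide

# Assumption: each blade-facing link serves exactly one node.
nodes_per_rack = BLADE_SWITCHES_PER_RACK * LINKS_TO_BLADES_PER_SWITCH
total_nodes = nodes_per_rack * RACKS

# Per-rack ranges quoted on the slide, scaled to the whole system.
cores_low, cores_high = 1000 * RACKS, 2000 * RACKS
tflops_low, tflops_high = 12 * RACKS, 25 * RACKS

print(f"nodes per rack: {nodes_per_rack}")
print(f"total nodes:    {total_nodes}")
print(f"total cores:    {cores_low}-{cores_high}")
print(f"peak:           {tflops_low}-{tflops_high} TF")
```

With one fibre per node, each rack also sends 128 cables up to the top switches, matching the 16 rear-facing links on each of its 8 switch modules.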

    QsNet III Fault Tolerance

    All of the QsNet II features
      CRCs on every packet
      Automatic retransmission
      Adaptive routing avoids failed links
      Redundant routes
      Redundant, hot-pluggable PSUs and fans

    + Full line rate testing of each link as it comes up
      Switches generate CRPAT, CJPAT or PRBS packets
      Links are only added to the route tables when they (a) are up, (b) connect to the right place, and (c) can transfer data without error
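
The three admission conditions (a) to (c) amount to a gate in front of the route tables. The sketch below is an illustration only: the `LinkStatus` fields and `admit_to_route_table` are hypothetical names, and the real firmware interface is not described in the slides.

```python
from dataclasses import dataclass


@dataclass
class LinkStatus:
    """Hypothetical result of bringing up and testing one link."""
    is_up: bool
    remote_port: str            # what the link is actually plugged into
    expected_remote_port: str   # what the network map says it should reach
    test_packets_sent: int      # e.g. CRPAT, CJPAT or PRBS test packets
    test_packet_errors: int


def admit_to_route_table(status):
    """Apply conditions (a), (b) and (c) from the slide."""
    link_is_up = status.is_up                                       # (a)
    wired_correctly = status.remote_port == status.expected_remote_port  # (b)
    error_free = (status.test_packets_sent > 0
                  and status.test_packet_errors == 0)               # (c)
    return link_is_up and wired_correctly and error_free


good = LinkStatus(True, "sw3:p12", "sw3:p12", 1_000_000, 0)
miswired = LinkStatus(True, "sw4:p12", "sw3:p12", 1_000_000, 0)
noisy = LinkStatus(True, "sw3:p12", "sw3:p12", 1_000_000, 7)
print(admit_to_route_table(good), admit_to_route_table(miswired),
      admit_to_route_table(noisy))
```

Requiring at least one test packet to have been sent means an untested link is treated the same as a failing one.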

    Software Model: Firmware & Drivers

    Base firmware in the ROMs
    Firmware modules loadable with the device driver
      Elan, OpenFabrics, 10GE Ethernet
    Kernel modules
      elan5, elan, rms
    Device dependent library (libelan5)
    Device independent library (libelan)
    User libraries

    Software Model: Elan Libraries

    Point-to-point message passing
    One-sided put/get
    Transparent rail striping
    Optimised collectives
    Locks and atomic ops
    Global memory allocation
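
Rail striping means spreading one message across several links (rails), such as the adapter's two QsNet III links, and reassembling it at the destination. The sketch below shows the general idea only; it is not the libelan implementation, and the function names, round-robin policy, and 4096-byte chunk size are assumptions.

```python
def stripe(message, rails=2, chunk=4096):
    """Split a message into chunks and deal them round-robin across rails.

    Each chunk carries its byte offset so the receiver can place it
    regardless of the order in which rails deliver.
    """
    per_rail = [[] for _ in range(rails)]
    for i, offset in enumerate(range(0, len(message), chunk)):
        per_rail[i % rails].append((offset, message[offset:offset + chunk]))
    return per_rail


def reassemble(per_rail, length):
    """Rebuild the original message from the per-rail chunk lists."""
    buf = bytearray(length)
    for rail in per_rail:
        for offset, data in rail:
            buf[offset:offset + len(data)] = data
    return bytes(buf)


message = bytes(range(256)) * 50          # 12800 bytes of sample data
rails = stripe(message)
print([len(r) for r in rails])            # number of chunks sent on each rail
print(reassemble(rails, len(message)) == message)
```

"Transparent" on the slide suggests the application issues a single send or put and the library does this splitting underneath.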

    Why Quadrics?

    Focus on the most demanding HPC applications
    Delivers large system scalability
      All nodes achieve host adapter bandwidth at the same time
      Minimal spread between best and worst case performance
      Low and uniform latency
      Highly optimised collectives
    Single supplier of interconnect hardware, software, support
    Stability of our products
    Track record of delivering production systems
    European company