Netspeed whitepaper

  • 8/11/2019 Netspeed whitepaper

    NetSpeed Gemini: A Scalable, Coherent, Network-on-Chip Solution

    The last few decades have seen massive growth in the number of CPU cores, computing clusters, and
    other IP blocks in an SoC. This growth, along with the need for complex chip integration, has
    driven the need for sophisticated interconnects. SoC architects have employed a variety of methods,
    from buses to crossbars to hand-crafted NoCs built from Lego(TM)-like blocks, with varying degrees
    of success.

    The growing number of agents accessing a critical resource like memory also means that shared
    data must be managed to ensure cache coherency. Coherency can be achieved through either a
    software-based or a hardware-based solution. Factors including performance, power, and
    time-to-market make hardware-based coherency the preferred solution.

    However, existing hardware-based coherency solutions have two key limitations that affect
    performance and scalability. First, coherency systems are usually fixed configurations, which means
    they cannot adapt to your system requirements; they may be over-designed or under-performing.
    Second, to manage complex on-chip communications, they employ separate interconnects for coherent
    and non-coherent traffic. This creates unnecessary floorplanning obstacles, prevents efficient
    resource sharing, requires multiple interconnect methodologies, and requires additional hardware
    support to allow the two kinds of traffic to interact. NetSpeed Gemini addresses these issues
    through a unique scalable coherency architecture and a sophisticated fabric that handles both
    coherent and non-coherent traffic.

    NetSpeed Gemini is the second product in the family of NoC IPs from NetSpeed Systems. It is a
    high-performance, scalable, coherent NoC solution. It supports all three levels of coherency
    (cache-coherent, I/O-coherent, and non-coherent traffic) in a single NoC. NetSpeed Gemini provides
    full cache coherency for small and large systems, from 1 to 64 coherent CPU clusters and from 1 to
    200 I/O and non-cached agents. NetSpeed Gemini NoCs deliver high performance and significant
    time-to-market advantages to SoC designers for a wide range of markets, from mobile and networking
    to high-performance computing.

    Executive Summary

    www.netspeedsystems.com

    CACHE COHERENCY TECHNIQUES

    Complexity in multi-core SoCs has increased dramatically over the last few years: CPU cores and
    other compute agents like GPUs and DSPs have grown in both number and complexity. In these SoCs,
    access to memory is the critical performance bottleneck. To address it, modern SoCs have adopted
    multiple layers of memory caching, from local or cluster-level L1 and L2 caches to a system-level
    L3 cache.

    As the number of caches increases, keeping these caches coherent with each other and with main
    memory has also become more difficult. Cache coherence is addressed through two main techniques:
    software-based coherency and hardware-based coherency. In the software-based coherency model, the
    programmer is tasked with maintaining memory coherency, dealing with stale memory, and invalidating
    cache and memory lines. Hardware-based coherency uses a coherency protocol and hardware support to
    maintain coherency in the system automatically.
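
    To make the hardware-based model concrete, the following minimal Python sketch models a directory that records which agents hold a copy of a line and invalidates the other copies on a write. The class name `ToyDirectory`, its methods, the agent names, and the write-through, invalidate-on-write policy are all assumptions chosen for illustration; the whitepaper does not describe Gemini's actual protocol at this level.

```python
from collections import defaultdict


class ToyDirectory:
    """Illustrative sharer-tracking directory (hypothetical, not Gemini's design)."""

    def __init__(self):
        # address -> set of agent ids that currently hold a valid copy
        self.sharers = defaultdict(set)
        # backing memory: address -> value
        self.memory = {}
        # per-agent private cache: agent id -> {address: value}
        self.caches = defaultdict(dict)

    def read(self, agent, addr):
        cache = self.caches[agent]
        if addr not in cache:
            # Miss: fetch from memory and record the agent as a sharer.
            cache[addr] = self.memory.get(addr, 0)
            self.sharers[addr].add(agent)
        return cache[addr]

    def write(self, agent, addr, value):
        # Invalidate every other cached copy so no agent can read stale data.
        for other in self.sharers[addr] - {agent}:
            self.caches[other].pop(addr, None)
        self.sharers[addr] = {agent}
        self.caches[agent][addr] = value
        self.memory[addr] = value  # write-through keeps the sketch simple


d = ToyDirectory()
d.write("cpu0", 0x40, 1)
assert d.read("gpu", 0x40) == 1   # gpu now caches the line
d.write("cpu0", 0x40, 2)          # the directory invalidates gpu's copy
assert d.read("gpu", 0x40) == 2   # gpu re-fetches without any software flush
```

    In the software-based model, the invalidation inside `write` would instead be the programmer's responsibility, typically as explicit cache maintenance before data is handed to another agent.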

    HARDWARE COHERENCY ADVANTAGE

    The complexity of software-based coherency systems grows with the number of agents as well as the
    kinds of agents in an SoC. With increasing use of heterogeneous architectures and sophisticated
    workloads, software-based coherency solutions do not scale. As shown in the figure below, this has
    led to an increasing percentage of software costs incurred in developing systems. Hardware-based
    coherency, on the other hand, has three distinct advantages:

    a. Reduces power: Caches do not need to be flushed when passing data between agents.

    b. Increases system performance: Sharing data requires no additional software overhead, and
    fine-grain sharing is possible.

    c. Reduces software complexity: Coherency is transparent to the software, allowing direct sharing
    of data without the need for software maintenance.

    Hardware-based coherency solutions reduce power and increase overall system performance

    Situation

    NetSpeed GEMINI: AN INTRODUCTION

    NetSpeed Gemini is the second product in NetSpeed's family of network-on-chip IP products. It is a
    fully cache-coherent, high-performance NoC IP. NetSpeed Gemini uses an innovative directory-based
    approach to address the issue of scalability in multicore and multi-cluster SoC systems, so SoC
    architects can build both small and large coherent interconnect systems. Using Gemini, architects
    can connect anywhere from 1 to 64 fully cache-coherent CPU clusters, GPU blocks, and other coherent
    compute blocks. It also supports 1 to 200 I/O-coherent and non-coherent agents. Currently, NetSpeed
    Gemini supports AMBA 4 agents, with a future revision planned to support AMBA 5.

    NetSpeed Gemini builds on the underlying NetSpeed NoC technology, allowing it to deliver a
    customized NoC for any given SoC specification. Many traditional approaches separate coherent and
    non-coherent traffic, which leads to inefficient resource sharing and requires additional hardware
    support to handle the two interconnects. NetSpeed Gemini, on the other hand, handles both coherent
    and non-coherent traffic in a single underlying fabric. It also uses a number of proven algorithms
    to optimize the SoC interconnect, providing a high-performance, coherent network-on-chip solution.
    Finally, NetSpeed Gemini uses graph theory and formal techniques to ensure that there are no
    protocol-level or network-level deadlocks in the entire system.

    Solution

    SoC architects can connect up to 64 coherent CPU clusters and up to 200 I/O and non-coherent agents

    NetSpeed NocStudio ARCHITECTURE EXPLORATION PLATFORM

    NetSpeed Gemini is configured and optimized using NocStudio, a NoC architecture exploration
    platform and design compiler. NocStudio takes detailed user specifications and uses machine
    learning algorithms to identify the ideal topology while solving complex SoC issues like QoS and
    deadlock avoidance. The NetSpeed Gemini design flow uses placement-aware optimizations to tailor
    the topology, and channel and buffer sizing are fully heterogeneous. Broadly, the NocStudio design
    flow has three main steps:

    1. SPECIFY: NocStudio takes high-level SoC specifications such as components and their connectivity,
    performance requirements (bandwidth, latency, power), coherency requirements (coherency bandwidth,
    protocol, participation level), and other SoC requirements like Quality of Service (QoS).

    2. OPTIMIZE: NocStudio performs many optimizations to construct the on-chip network.

    Coherency Controller Optimization: Based on the coherency bandwidth requirements, NocStudio
    automatically identifies the number of coherency controllers needed in the system, as well as other
    Gemini coherency IP blocks needed for the SoC, such as the NCB (Non-cache Bridge) and DVM.

    Automatic Topology Generation: Based on floorplan, connectivity, and performance specifications,
    NocStudio converges on a suitable NoC topology, such as a bus, a mesh, or even a heterogeneous
    topology. Routes for the various flows between IP blocks are selected during NoC configuration to
    reduce latency, meet bandwidth requirements, and minimize power and area.

    Layer Optimization: NetSpeed Gemini supports up to 8 physical layers and 32 virtual networks. These
    layers and networks are fully heterogeneous and are optimized to meet end-to-end requirements.

    A cycle-aware performance simulator is available to characterize the performance of the NoC.

    3. GENERATE: The final step in the design flow generates synthesizable RTL along with C++
    functional models, detailed performance statistics, and sanity verification test benches.
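
    The coherency controller optimization in step 2 can be pictured as a sizing calculation: divide the coherent lookup rate the system needs by the rate one controller can sustain. The sketch below is only a back-of-the-envelope illustration under stated assumptions (one directory lookup per cache line transferred, a 64-byte line, one lookup per controller per cycle). The function name and all parameters are hypothetical; they are not NocStudio inputs, and the whitepaper does not disclose the algorithm NocStudio actually uses.

```python
import math


def controllers_needed(coherent_gbytes_per_s, clock_ghz, line_bytes=64,
                       lookups_per_cycle=1.0):
    """Estimate how many coherency controllers a bandwidth target requires.

    Every parameter is an illustrative assumption, not a NocStudio input.
    """
    # Lookups per second the system must sustain (one lookup per cache line).
    required_lookups = coherent_gbytes_per_s * 1e9 / line_bytes
    # Lookups per second a single controller can sustain.
    per_controller = clock_ghz * 1e9 * lookups_per_cycle
    # Round up, and always keep at least one controller.
    return max(1, math.ceil(required_lookups / per_controller))


for bandwidth in (10, 100, 256):
    print(bandwidth, "GB/s ->", controllers_needed(bandwidth, clock_ghz=1.0))
```

    A real flow would also have to account for how addresses are distributed across controllers and for traffic burstiness, which this estimate ignores.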

    Design Flow

    SCALABLE ARCHITECTURE AND SPECIALIZED ACCELERATORS

    1. SCALABLE ARCHITECTURE: NetSpeed Gemini achieves scalability through multiple design dimensions.

    a. Coherency Bandwidth: The number of coherency controllers needed for an SoC is determined
    automatically from the coherent bandwidth the system requires. Employing multiple coherency
    controllers enables more coherent lookups per cycle.

    b. Directory Structure: The directory structure used in Gemini is a unique, scalable directory.
    Typical directories grow on the order of O(n^2) with the number of agents n, as more entries are
    needed and each entry must track more caches. NetSpeed's directory solution, however, grows close
    to linearly with an increasing number of entries and agents. The Gemini directory is built to
    reduce power by limiting the number of associative ways, while using advanced directory encodings
    and management to maintain peak performance levels.

    c. Underlying Interconnect Architecture: The NoC underlying NetSpeed Gemini scales with increasing
    traffic in the SoC. This is achieved through the use of multiple physical layers in the NoC.

    2. SPECIALIZED ACCELERATORS: NetSpeed Gemini includes an accelerator for ordered coherent traffic
    called the Non-cache Bridge. It achieves higher ordered throughput by performing coherent lookups
    in parallel while ensuring that completion occurs in the specified order. Gemini also includes
    hardware support for Distributed Virtual Memory (DVM), enabling memory management operations to be
    distributed to all required agents.
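
    The O(n^2) growth mentioned under the directory structure item follows from simple arithmetic: a full bit-vector directory needs one presence bit per agent in every entry, and the number of entries itself grows with the number of agents. The sketch below compares that with a limited-pointer encoding, a well-known textbook alternative that grows roughly as n log n. The limited-pointer scheme is used here purely as a stand-in; the whitepaper does not disclose Gemini's actual directory encoding, and the function names, the 1024 entries per agent, and the 4 pointers per entry are assumptions.

```python
import math


def full_map_bits(n_agents, entries_per_agent):
    """Full bit-vector directory: one presence bit per agent in every entry."""
    entries = n_agents * entries_per_agent   # tracked lines grow with agents
    return entries * n_agents                # total bits grow as O(n^2)


def limited_pointer_bits(n_agents, entries_per_agent, pointers=4):
    """Limited-pointer directory: each entry stores a few agent ids instead."""
    entries = n_agents * entries_per_agent
    id_bits = max(1, math.ceil(math.log2(n_agents)))
    return entries * pointers * id_bits      # total bits grow as O(n log n)


for n in (4, 16, 64):
    print(n, full_map_bits(n, 1024), limited_pointer_bits(n, 1024))
```

    With these assumed parameters the two encodings cost the same at 16 agents, and the full bit-vector costs more beyond that point, which is the regime where a near-linear directory pays off.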

    Employing multiple coherency controllers enables more coherent lookups & increases coherent bandwidth

    Features

    NOC PLATFORM, CORRECT-BY-CONSTRUCTION & USER-CONFIGURABILITY

    1. NOC PLATFORM: The unique architecture of NetSpeed Gemini allows it to scale performance to match
    both the growing number of IP blocks and increasing design complexity. This allows NetSpeed IP to
    be used as a NoC platform for entire product families. The underlying hardware elements of NetSpeed
    Gemini, such as the coherency controller, coherency directory, and router modules, are designed to
    support high throughput with a low footprint and low power. Using these elements, efficient NoCs
    can be built for a variety of SoCs, from mobile to enterprise networking and high-performance
    computing.

    2. CORRECT-BY-CONSTRUCTION NOC: NetSpeed Gemini uses patent-pending algorithms to design NoCs
    that are correct by construction. It uses graph theory and formal techniques to ensure that there
    are no cycles in the entire message dependency chain. It captures dependencies from protocol
    requirements, traffic flows, and user specifications. The combined dependency specification is used
    to ensure full deadlock avoidance at both the protocol and network levels.

    3. USER-CONFIGURABILITY: Many existing coherency solutions are fixed-configuration solutions,
    leading to system designs that may be under-performing or over-designed. NetSpeed Gemini, however,
    is a fully configurable and customizable coherent NoC IP. It is configured and optimized using
    NocStudio, a NoC architecture exploration platform. Using NocStudio, SoC designers can describe
    their interconnect specifications, such as floorplan, connectivity, bandwidth, and latency, at a
    high level. In a user-controlled and automated design environment, a number of interconnect design
    choices can be rapidly generated, evaluated, and benchmarked.
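
    The correct-by-construction item above rests on a standard graph-theory check: if the combined dependency graph has no cycle, no set of messages can wait on each other in a loop. The sketch below shows one common way to perform that check, a depth-first search that reports a back edge. It is a generic illustration, not NetSpeed's patent-pending algorithm; the function name `find_cycle` and the message classes `req`, `snoop`, and `resp` are hypothetical.

```python
def find_cycle(deps):
    """Return one dependency cycle as a list of nodes, or None if acyclic.

    `deps` maps each channel or message class to the ones it can wait on.
    """
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on current path, finished
    nodes = set(deps) | {m for targets in deps.values() for m in targets}
    color = {n: WHITE for n in nodes}
    stack = []                            # nodes on the current DFS path

    def visit(node):
        color[node] = GREY
        stack.append(node)
        for nxt in deps.get(node, ()):
            if color[nxt] == GREY:
                # Back edge: the path from nxt to node closes a cycle.
                return stack[stack.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for n in sorted(nodes):
        if color[n] == WHITE:
            found = visit(n)
            if found:
                return found
    return None


# Hypothetical message classes: requests wait on snoops, snoops on responses.
safe = {"req": ["snoop"], "snoop": ["resp"], "resp": []}
# Letting responses wait on requests closes the loop and can deadlock.
unsafe = {"req": ["snoop"], "snoop": ["resp"], "resp": ["req"]}

assert find_cycle(safe) is None
assert find_cycle(unsafe) == ["req", "snoop", "resp", "req"]
```

    A design tool can run such a check on every candidate configuration and reject, or re-route, any configuration for which a cycle is reported.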

    Benefits

    Patent-pending algorithms and formal methods to design NoCs that are correct-by-construction

    Copyright 2014, NetSpeed Systems. All rights reserved. No part of this document may be reproduced,
    stored in a retrieval system, transmitted in any form or by any means, electronic, mechanical,
    photocopying, recording or otherwise, without express written permission from NetSpeed Systems.
    The information contained herein is subject to change without notice. All other trademarks
    mentioned herein are the property of their respective owners.

    A SCALABLE, COHERENT, NETWORK-ON-CHIP SOLUTION

    The growing number of computing blocks in an SoC, increasing design complexity, and the paradigm
    shift towards hardware-driven coherency have created a need for scalable, coherent interconnect
    solutions. NetSpeed Gemini addresses these needs. It uses a number of proven algorithms to optimize
    interconnects, providing a scalable, high-performance, correct-by-construction network-on-chip
    solution. NetSpeed Gemini's coherency architecture is based on an innovative directory that scales
    the number of coherency modules depending on high-level SoC specifications while dramatically
    reducing area and power.

    About NetSpeed Systems

    NetSpeed Systems provides scalable, coherent, on-chip network IPs to SoC designers for a wide range
    of markets from mobile to high-performance computing and networking. NetSpeed's on-chip network IPs
    deliver significant time-to-market advantages through a system-level approach, a high level of
    user-driven automation and state-of-the-art algorithms.

    Corporate Headquarters

    2670 Seely Avenue, San Jose, CA 95134

    Know more

    For more information about this and other products, contact:

    [email protected]

    Conclusion