
Page 1

Fermi Cluster for Real-Time Hyperspectral Scene Generation

Gary McMillian, Ph.D. Crossfield Technology LLC

9390 Research Blvd, Suite I200 Austin, TX 78759-7366

(512)795-0220 x151 [email protected]

AF SBIR Program, Donald Snyder, III Program Manager

Funding provided by Frank Carlen, Multi-Spectral Test

Page 2

System Architecture & Approach

•  Scenes generated by heterogeneous processors, then transported over InfiniBand to the projector(s) using the RDMA protocol for high throughput and low latency

•  Network interfaces aggregate data from multiple heterogeneous processors in high-speed frame buffers

•  Contents of frame buffers output to projector through an FPGA Mezzanine Card (FMC) interface

•  IEEE 1588 Precision Time Protocol (PTP) provides global time synchronization

•  Heterogeneous processors and projector network interfaces scale independently


Page 3

Scalable System Architecture


[Diagram: multiple processor nodes (CPU/GPU) with network interface adapters connect through an InfiniBand switch to one or more network interfaces, which drive the projector/HWIL over DVI or LVDS fiber links.]

Page 4

HWIL Simulation System


[Diagram: a 1U-4U heterogeneous processor node (dual CPUs with DDR3 SDRAM linked by QPI, GPUs with GDDR5 SDRAM, an SSD, and network adapters behind PCIe bridges on PCIe x8) connects through an InfiniBand switch (36-648 ports) to a 1U Crossfield Network Interface (network adapter, FPGA with DDR3 SDRAM, PHY, and FMC) that drives the projector/HWIL. An IEEE 1588 PTP server synchronizes the nodes over Ethernet, and a user-definable frame synch/request signal closes the loop from the projector.]

Link bandwidths (approximate):
QuickPath Interconnect (QPI) ~100 Gbps
PCI Express x8 ~32 Gbps (x16 ~64 Gbps)
DDR3 SDRAM ~85 Gbps/ch x 3 ch
GDDR5 SDRAM ~192 Gbps/ch x 6 ch
QDR InfiniBand ~32 Gbps
VITA 57.1 / FMC ~100 Gbps SERDES + 120 Gbps LVDS I/O

Page 5

REAL-TIME HIGH PERFORMANCE COMPUTER (HPC)


Page 6

Real-Time HPC Requirements

•  Deterministic & Synchronous

–  Synthesized images complete & ready at the HWIL frame rate (a frame-pacing sketch follows this list)

•  High Floating-Point Performance

–  Implement physics-based algorithms

•  High Bandwidth

–  Inter-processor communications for data exchange

–  Stream high-resolution images to projector at high frame rates

•  High Memory Capacity & Performance

–  Processor memory – code, model parameters, data

–  Non-volatile storage – code, model parameters, data, logging
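
The first requirement above implies a fixed-deadline frame loop. A minimal pacing sketch in C, assuming a hypothetical render_frame() hook and an illustrative 100 Hz HWIL frame rate (neither is specified in the slides); sleeping to absolute deadlines keeps wake-up jitter from accumulating across frames:

```c
#include <stdint.h>
#include <time.h>

#define FRAME_PERIOD_NS 10000000L  /* 100 Hz HWIL rate (illustrative) */

extern void render_frame(uint64_t frame);  /* hypothetical scene-generation hook */

static void advance(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= 1000000000L) {
        t->tv_nsec -= 1000000000L;
        t->tv_sec++;
    }
}

void frame_loop(void)
{
    struct timespec next;
    clock_gettime(CLOCK_MONOTONIC, &next);
    for (uint64_t frame = 0; ; frame++) {
        render_frame(frame);              /* must finish within the period */
        advance(&next, FRAME_PERIOD_NS);
        /* Sleep to an absolute deadline: wake-up error does not accumulate. */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
    }
}
```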


Page 7

Intel Xeon Processor Roadmap


Westmere Microarchitecture
•  32 nm process, 6 cores
•  40 lanes PCI Express Gen 2.0
•  3 channels DDR3-1333

Sandy Bridge Microarchitecture
•  32 nm process, 4-8 cores
•  40 lanes PCI Express Gen 3.0
•  4 channels DDR3-1600

Page 8

Nvidia CUDA GPU Roadmap


(Roadmap announced 21 SEP 2010)
•  Kepler – to be released sometime in 2011; 28 nm process; estimated performance of 4-6 DP GFLOPS/W
•  Maxwell – to be released sometime in 2013; 22 nm process; estimated performance of 15-16 DP GFLOPS/W

Page 9

Nvidia Tesla (Fermi Architecture)

•  CUDA™ Programming Environment
–  C/C++, Fortran, OpenCL, Java, Python or DirectX Compute

•  GigaThread™ Engine
–  515 GFLOPS double precision
–  1030 GFLOPS single precision

•  Parallel DataCache™ Technology
–  3-6 GB GDDR5 memory
–  384-bit bus
–  ECC option

•  GPUDirect™ with InfiniBand

•  PCI Express 2.0 (16 lanes)
–  Two DMA engines for bidirectional data transfer

[Photos: Tesla C2050/C2070 and M2050/M2070 boards]

Page 10

Nvidia Tesla Comparison

                                          Tesla C2070   Tesla M2070   Tesla M2090
Peak double precision FP performance      515 GFLOPS    515 GFLOPS    665 GFLOPS
Peak single precision FP performance      1030 GFLOPS   1030 GFLOPS   1331 GFLOPS
CUDA cores                                448           448           512
Memory size (GDDR5)                       6 GB          6 GB          6 GB
Memory bandwidth (ECC off)                144 GB/s      150 GB/s      177 GB/s
Thermal Design Power (TDP)                247 W         225 W         250 W
Retail price                              $2300         ~$2300        ~$3500


Page 11

InfiniBand Roadmap


SDR - Single Data Rate
DDR - Double Data Rate
QDR - Quad Data Rate
FDR - Fourteen Data Rate
EDR - Enhanced Data Rate
HDR - High Data Rate
NDR - Next Data Rate

Page 12

Mellanox ConnectX-2 Network Adapters

•  Nvidia GPUDirect™
–  InfiniBand adapter and Nvidia GPU share a CPU memory region (see the sketch after this list)

•  Open Fabrics Enterprise Distribution (OFED) software

•  Bandwidth
–  10G Ethernet
–  10/20/40G InfiniBand
–  PCIe 2.0 (8 lanes)

•  Performance
–  1 µs ping latency
–  50M MPI messages/s

•  Protocol Support
–  Remote Direct Memory Access (RDMA)
–  OpenMPI, OSU MVAPICH, HP-MPI, Intel MPI, MS MPI, Scali MPI
–  TCP/UDP, IPoIB, SDP, RDS
–  SRP, iSER, NFS RDMA, FCoIB, FCoE
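
A minimal sketch of the shared-pinned-buffer idea behind the GPUDirect™ bullet above: one page-locked host buffer is registered with both the CUDA runtime and the HCA, so neither side needs an extra staging copy. The calls are the standard CUDA runtime and libibverbs APIs; pd is assumed to be an already-created protection domain, and error handling is abbreviated:

```c
#include <cuda_runtime.h>
#include <infiniband/verbs.h>

/* Register one host buffer with both the GPU and the InfiniBand HCA. */
struct ibv_mr *setup_shared_buffer(struct ibv_pd *pd, void **buf, size_t len)
{
    /* Page-locked allocation the CUDA driver can DMA into. */
    if (cudaHostAlloc(buf, len, cudaHostAllocDefault) != cudaSuccess)
        return NULL;

    /* Register the same region with the HCA for RDMA access. */
    return ibv_reg_mr(pd, *buf, len,
                      IBV_ACCESS_LOCAL_WRITE |
                      IBV_ACCESS_REMOTE_WRITE |
                      IBV_ACCESS_REMOTE_READ);
}
```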


Page 13

Mellanox IS5200 InfiniBand Switch

•  Non-blocking, full bisectional bandwidth

•  100-300 ns latency

•  Up to 216 QSFP ports
–  17.28 Tb/s aggregate throughput

•  9U cabinet –  6 spine modules

–  12 leaf modules

•  1 kW


Page 14

Remote Direct Memory Access (RDMA)

•  Remote Direct Memory Access enables data to be transferred from one processor's memory to another processor's memory across a network, without significantly involving either operating system

•  RDMA supports zero-copy data transfers by enabling the network adapter to move data directly to or from application memory, eliminating the copy between application memory and data buffers in the operating system kernel (see the sketch below)

•  RDMA defines READ, WRITE and SEND/RECEIVE operations

•  RDMA adapters support thousands of concurrent transactions using work queues
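
A minimal libibverbs sketch of the zero-copy write described above. It assumes a connected queue pair qp, a registered memory region mr covering buf, and a peer remote_addr/rkey already exchanged out of band (all connection setup omitted):

```c
#include <stdint.h>
#include <string.h>
#include <infiniband/verbs.h>

/* Push `len` bytes of frame data straight into the peer's memory. */
int rdma_write_frame(struct ibv_qp *qp, struct ibv_mr *mr,
                     void *buf, size_t len,
                     uint64_t remote_addr, uint32_t rkey)
{
    struct ibv_sge sge = {
        .addr   = (uintptr_t)buf,   /* local, pre-registered buffer */
        .length = (uint32_t)len,
        .lkey   = mr->lkey,
    };
    struct ibv_send_wr wr, *bad_wr = NULL;

    memset(&wr, 0, sizeof(wr));
    wr.opcode     = IBV_WR_RDMA_WRITE;  /* no remote CPU involvement */
    wr.sg_list    = &sge;
    wr.num_sge    = 1;
    wr.send_flags = IBV_SEND_SIGNALED;  /* completion lands in the send CQ */
    wr.wr.rdma.remote_addr = remote_addr;
    wr.wr.rdma.rkey        = rkey;

    return ibv_post_send(qp, &wr, &bad_wr);
}
```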


Page 15


[Diagram: the OpenFabrics Alliance (OFA) open-source OFED software stack. Hardware-specific drivers for InfiniBand HCAs and iWARP R-NICs sit beneath a kernel mid-layer (kernel-level verbs/API, MAD, SMA, and SA client services, connection managers, and the Connection Manager Abstraction). Upper-layer protocols (IPoIB, SDP, SRP, iSER, RDS, NFS-RDMA RPC, cluster file systems) and user-space APIs (user-level verbs/API, uDAPL, SDP library, user-level MAD API, OpenSM, diagnostic tools) serve applications via sockets-based access, the various MPIs, file-system and block-storage access, clustered database access, and IP-based applications; the data path bypasses the kernel. The original figure color-codes components as common, InfiniBand-specific, or iWARP-specific.]

Key:
SA - Subnet Administrator
MAD - Management Datagram
SMA - Subnet Manager Agent
PMA - Performance Manager Agent
IPoIB - IP over InfiniBand
SDP - Sockets Direct Protocol
SRP - SCSI RDMA Protocol (Initiator)
iSER - iSCSI RDMA Protocol (Initiator)
RDS - Reliable Datagram Service
uDAPL - User Direct Access Programming Library
HCA - Host Channel Adapter
R-NIC - RDMA NIC

Page 16

GPU Server Options

•  1U server

–  Dual Xeon 5600 processors & 5520 chipsets

–  Three 16-lane + one 8-lane PCIe slots

–  Supports 1-3 M2090 + 1-2 IB HCA

•  2U server

–  Dual Xeon 5600 processors & 5520 chipsets

–  Four 16-lane + two 8-lane PCIe slots (PLX 8647 switch)

–  Supports 1-4 M2090 + 1-2 IB HCA

•  4U server

–  Dual Xeon 5600 processors & 5520 chipsets

–  Eight 16-lane PCIe slots (4 PLX 8647 switches)

–  Supports 4-7 C2070 + 1-4 IB HCA


Page 17

HPC System Configuration

•  4U Servers (64 + 1)

–  Dual 6-core, 2.66 GHz Intel Xeon 5650 (Westmere) CPUs

–  Dual Intel 5520 (Tylersburg-36D) IOH with 6.4 GT/s QPI

•  Four 16-lane PCI Express Gen 2 slots

–  Six 8 GB DDR3-1333 DIMMs (48 GB)

–  Four Nvidia Tesla C2070 (Fermi) GPUs

–  One Mellanox 40G InfiniBand Host Channel Adapter

–  One 300 GB, 10K RPM disk drive

•  Mellanox 40G InfiniBand Switch (216 ports max)

•  Symmetricom IEEE 1588 PTP Master Clock

•  APC Smart-UPS RT 6000VA (18) – 76 kW

•  42U Racks (9)


*65 nodes x 1.4 kW/node = 91 kW

Page 18

Advanced HPC System Configuration

•  2U Servers (64 + 1)

–  Dual 6-core, 2.66 GHz Intel Xeon 5650 (Westmere) CPUs

–  Dual Intel 5520 (Tylersburg-36D) IOH with 6.4 GT/s QPI

•  Four 16-lane + two 8-lane PCI Express Gen 2 slots (with switch)

–  Six 8 GB DDR3-1333 DIMMs (48 GB)

–  Three Nvidia Tesla M2090 (Fermi) GPUs

–  Two Mellanox 40G InfiniBand Host Channel Adapters

–  One 250 GB SSD (solid state disk)

•  Mellanox 40G InfiniBand Switch (216 ports max)

•  Symmetricom IEEE 1588 PTP Master Clock

•  APC Symmetra PX SY100K100F UPS - 100 kW

•  42U Racks (4+1)


Page 19

Future HPC System Configuration

•  2U Servers (64 + 1)

–  Dual 8-core, 2.3 GHz Intel Xeon E5-2600 (Sandy Bridge) CPUs

•  Four 16-lane + two 8-lane PCI Express Gen 3 slots (with switch)

–  Eight 8 GB DDR3-1600 DIMMs (64 GB)

–  Three Nvidia Tesla M2090 (Fermi) GPUs

–  Two Mellanox 56G InfiniBand Host Channel Adapters

–  One 250 GB SSD (solid state disk)

•  Mellanox 56G InfiniBand Switch (648 ports max)

•  Symmetricom IEEE 1588 PTP Master Clock

•  APC Symmetra PX SY100K100F UPS - 100 kW

•  42U Racks (4+1)


Page 20

IEEE 1588 Precision Time Protocol

•  IEEE 1588-2008 Precision Time Protocol (PTP) Version 2 overcomes network and application latency and jitter through hardware time stamping at the physical layer of the network.

•  IEEE 1588-2008 provides time transfer accuracy in the sub-microsecond range, a significant improvement in time synchronization accuracy over the Network Time Protocol (NTP).

•  The Symmetricom XLi Grandmaster is IEEE 1588-2008 PTP V2 compliant and time stamps PTP packets with a time stamp accuracy of 50 ns to UTC. Measured synchronization accuracy at a PTP client has been shown to be as good as a 17 ns offset from the XLi Grandmaster. Operating at 100BaseT line speed with deep time stamp packet buffers, the XLi Grandmaster can support thousands of 1588 clients.
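
On a Linux node, hardware timestamping of this kind disciplines the NIC's PTP hardware clock, which the kernel exposes as a character device. A minimal read sketch, assuming the clock appears as /dev/ptp0 (the device name varies by system); FD_TO_CLOCKID is the standard Linux dynamic-clock mapping:

```c
#include <fcntl.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Standard Linux mapping from a /dev/ptpN fd to a dynamic clock id. */
#define FD_TO_CLOCKID(fd) ((~(clockid_t)(fd) << 3) | 3)

int main(void)
{
    int fd = open("/dev/ptp0", O_RDONLY);  /* NIC's PTP hardware clock */
    if (fd < 0) {
        perror("open /dev/ptp0");
        return 1;
    }

    struct timespec ts;
    if (clock_gettime(FD_TO_CLOCKID(fd), &ts) == 0)
        printf("PHC time: %lld.%09ld\n", (long long)ts.tv_sec, ts.tv_nsec);

    close(fd);
    return 0;
}
```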


Page 21

Uninterruptable Power Supply (UPS)

•  APC Symmetra PX 100kW

•  Scalable to 100kW/100kVA

•  208V 3PH 332A Service


Page 22

APC Symmetra PX Performance


Page 23

HPC Performance

                              Node            System
Cores - CPU/GPU               12/1536         768/98304
CPU SP FP Performance         128 GFLOPS      8 TFLOPS
CPU DP FP Performance         64 GFLOPS       4 TFLOPS
GPU SP FP Performance         3990 GFLOPS     255 TFLOPS
GPU DP FP Performance         1995 GFLOPS     128 TFLOPS
Main Memory Size              48 GB           3 TB
Main Memory BW                64 GB/s         4 TB/s
Disk Size                     250 GB          16 TB
Disk IOPS (4 KB)              20K             1.28M
Disk R/W BW                   500/315 MB/s    32/20 GB/s
Network BW                    50 Gb/s         3.2 Tb/s
Power                         1.5 kW          100 kW


Page 24

HPC Procurement Schedule

•  Breadboard Performance Evaluation 15 JUL

•  Finalize HPC Configuration 15 JUL

–  # Fermi Processors (4 -> 3)

–  # IB Adapters (1 -> 2)

–  UPS (100 kW), Server (4U -> 2U), SSD

•  Request Final Vendor Quotes 1 AUG

•  HPC Vendor Selection

–  Issue HPC System Purchase Order OCT 31

•  HPC System Integration & Test by Vendor

–  6-12 week delivery ARO

•  Installation DEC 31

–  Prepare electrical supply for UPS


Page 25

REAL-TIME LINUX


Page 26

Real-Time Operating System (RTOS)

•  Requirements

– No dropped frames during simulation run

– Support Nvidia’s CUDA

– Support InfiniBand Adapter with GPUDirect™

– Support Precision Time Protocol (PTP) IEEE 1588

•  Candidate RTOS’

– Concurrent Computer RedHawk

– RedHat MRG (Messaging, Real-Time, Grid)


Page 27

Interrupt Dispatch Latency*


*Ravi Malhotra, “Real-Time Performance on Linux-based Systems,” 2011 Freescale Technology Forum

Page 28

Real-Time Support on Linux*

•  Traditionally, Linux is not a real-time operating system

– Designed for server throughput performance rather than embedded-systems latency

– Scheduling latencies can be unbounded

– The Big Kernel Lock and other mechanisms (softIRQs) typically end up blocking real-time-critical tasks

– Processes cannot be preempted while executing system calls (a sketch of the usual application-side countermeasures follows)
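
Even on a patched kernel, the application must opt in to real-time behavior. A minimal sketch of the usual countermeasures: lock all pages so page faults cannot stall the frame loop, then switch to the SCHED_FIFO fixed-priority policy (the priority value is illustrative):

```c
#include <sched.h>
#include <stdio.h>
#include <sys/mman.h>

int make_realtime(void)
{
    /* Pin current and future pages so page faults cannot stall the loop. */
    if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0) {
        perror("mlockall");
        return -1;
    }

    /* Fixed-priority FIFO scheduling preempts normal time-shared tasks. */
    struct sched_param sp = { .sched_priority = 80 };  /* illustrative */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0) {
        perror("sched_setscheduler");
        return -1;
    }
    return 0;
}
```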


*Ravi Malhotra, “Real-Time Performance on Linux-based Systems,” 2011 Freescale Technology Forum

Page 29

Sources of Latency & How RT Patch Helps*


*Ravi Malhotra, “Real-Time Performance on Linux-based Systems,” 2011 Freescale Technology Forum

Page 30

HPC PERFORMANCE MODEL


Page 31

Hyperformix Workbench Performance Model


Page 32

Workbench Model Steps

The application comprises 9 steps covering the generation and transfer of a frame (steps 4-7 are sketched after the list):

1.  Projector requests frame (provides state data)

2.  CPU sets up the Frame Generation Process

3.  CPU writes task data to CPU Memory (DDR3 SDRAM)

4.  CPU tasks the GPU to synthesize the Frame

5.  GPU reads the task data from CPU memory

6.  GPU synthesizes the Frame

7.  GPU transfers the frame data to CPU memory

8.  CPU tasks the InfiniBand Network Adapters to transfer the frame to the Crossfield Network Interface via the InfiniBand Switch

9.  Network Adapters transfer the frame to FPGA memory using RDMA Protocol
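
Steps 4-7 map onto an ordinary CUDA host-side sequence. A minimal sketch, with synthesize_frame() standing in for the unspecified scene-synthesis kernel launcher (hypothetical, not from the slides) and buffer sizes illustrative:

```c
#include <cuda_runtime.h>

/* Hypothetical launcher for the scene-synthesis kernel. */
extern void synthesize_frame(float *d_frame, const float *d_task, size_t n);

int generate_frame(const float *h_task, float *h_frame, size_t n)
{
    float *d_task, *d_frame;
    cudaMalloc((void **)&d_task,  n * sizeof(float));
    cudaMalloc((void **)&d_frame, n * sizeof(float));

    /* Step 5: GPU pulls task data from CPU memory. */
    cudaMemcpy(d_task, h_task, n * sizeof(float), cudaMemcpyHostToDevice);

    /* Step 6: GPU synthesizes the frame. */
    synthesize_frame(d_frame, d_task, n);

    /* Step 7: frame returns to CPU memory for the RDMA transfer (step 9). */
    cudaMemcpy(h_frame, d_frame, n * sizeof(float), cudaMemcpyDeviceToHost);

    cudaFree(d_task);
    cudaFree(d_frame);
    return cudaGetLastError() == cudaSuccess ? 0 : -1;
}
```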


Page 33

Hyperformix Workbench Performance Model


Page 34

Workbench Model Results


Application step                                                                  Response (µs)

Application.Step_1_Frame_Request_from_Projector.response 1.151

Application.Step_2_and_3_Setup_Process_and_write_data_to_memory.response 0.1923

Application.Step_4_CPU_tasks_GPU.response 0.1923

Application.Step_5_GPU_reads_data_from_CPU_Memory.response 0.4148

Application.Step_6_GPU_synthesizes_Frame_first_transfer.response 1000

Application.Step_7_GPU_xfers_Frame_to_CPU_memory.response 917.7

Application.Step_8_CPU_tasks_Network_Adapter_to_transfer_Frame_to_NI.response 0.1682

Application.Step_9_Network_Adapter_xfer_frame_to_NI_FPGA_Memory.response 2259

Application.Main_RT_App.All_Steps_transfer_RT_2 4179

(The per-step responses sum to ≈4179 µs, matching the end-to-end figure; GPU synthesis, the GPU-to-CPU copy, and the network transfer to FPGA memory dominate.)

Page 35

PROJECTOR INTERFACE


Page 36

Projector Interfaces

FPGA Mezzanine Cards (FMC)

1.  Two Dual DVI

2.  Parallel Fiber Optic Ports (8-10)

3.  Digital Micromirror Device (DMD) Interface

– All modules provide 2 User Definable I/Os, e.g.

• HWIL Synchronization Signal

• Output Next Frame
