Tesla Brochure 12 Lr


NVIDIA TESLA

GPU COMPUTING: REVOLUTIONIZING HIGH PERFORMANCE COMPUTING

To learn more, go to www.nvidia.com/tesla. © 2010 NVIDIA Corporation. NVIDIA, the NVIDIA logo, CUDA, GPUDirect, Parallel Nsight, and Tesla are trademarks and/or registered trademarks of NVIDIA Corporation in the United States and other countries. Other company and product names may be trademarks of the respective companies with which they are associated. All rights reserved. 10/2010


GPUS ARE REVOLUTIONIZING COMPUTING

The high performance computing (HPC) industry's need for computation is increasing, as large and complex computational problems become commonplace across many industry segments. Traditional CPU technology, however, is no longer capable of scaling in performance sufficiently to address this demand.

The parallel processing capability of the Graphics Processing Unit (GPU) allows it to divide complex computing tasks into thousands of smaller tasks that can be run concurrently. This ability is enabling computational scientists and researchers to address some of the world's most challenging computational problems up to several orders of magnitude faster.


The use of GPUs for computation is a dramatic shift in HPC. GPUs deliver performance increases of 10x to 100x to solve problems in minutes instead of hours, outpacing the performance of traditional computing with x86-based CPUs alone. In addition, GPUs also deliver greater performance per watt of power consumed.

From climate modeling to medical tomography, NVIDIA Tesla GPUs are enabling a wide variety of segments in science and industry to progress in ways that were previously impractical, or even impossible, due to technological limitations.

Conventional CPU computing architecture is no longer scaling to match the demands of HPC.

Co-processing refers to the use of an accelerator, such as a GPU, to offload the CPU and to increase computational efficiency.

Figure: Processor performance versus VAX, 1978-2016, showing CPU performance growth rates of 25%, 52%, and 20% per year over successive eras, with GPUs projected to hold a large performance advantage by 2021. Source: Hennessy & Patterson, CAAQA, 4th Edition.

WHY GPU COMPUTING?

With the ever-increasing demand for more computing performance, the HPC industry is moving toward a hybrid computing model, where GPUs and CPUs work together to perform general purpose computing tasks.

As parallel processors, GPUs excel at tackling large amounts of similar data because the problem can be split into hundreds or thousands of pieces and calculated simultaneously. As sequential processors, CPUs are not designed for this type of computation, but they are adept at more serial-based tasks such as running operating systems and organizing data. NVIDIA's GPU solutions outpace others as they apply the most relevant processor to the specific task in hand.

Tesla GPU computing delivers transformative increases in performance for a wide range of HPC industry segments. Representative speedups include:

> 50x MATLAB Computing (AccelerEyes)
> Financial computation (Oxford University)
> 30x Gene Sequencing (University of Maryland)
> Molecular dynamics (University of Illinois, Urbana-Champaign)
> 146x Medical Imaging (University of Utah)
> 20x 3D Ultrasound (TechniScan)
> 18x Video Transcoding (Elemental Technologies)
> 5x Digital Content Creation (Adobe)
> 100x Astrophysics (RIKEN)
> 80x Weather Modeling (Tokyo Institute of Technology)

"The convergence of new, fast GPUs optimized for computation as well as 3D graphics acceleration and industry-standard software development tools marks the real beginning of the GPU computing era."

Nathan Brookwood, Principal Analyst & Co-Founder, Insight64


CUDA PARALLEL COMPUTING ARCHITECTURE

CUDA is NVIDIA's parallel computing architecture. Applications that leverage the CUDA architecture can be developed in a variety of languages and APIs, including C, C++, Fortran, OpenCL, and DirectCompute. The CUDA architecture contains hundreds of cores capable of running many thousands of parallel threads, while the CUDA programming model lets programmers focus on parallelizing their algorithms and not the mechanics of the language.

The latest generation CUDA architecture, codenamed Fermi, is the most advanced GPU computing architecture ever built. With over three billion transistors, Fermi is making GPU and CPU co-processing pervasive by addressing the full spectrum of computing applications. With support for C++, GPUs based on the Fermi architecture make parallel processing easier and accelerate performance on a wider array of applications than ever before. Just a few applications that can experience significant performance benefits include ray tracing, finite element analysis, high-precision scientific computing, sparse linear algebra, sorting, and search algorithms.

PARALLEL ACCELERATION

Multi-core programming with x86 CPUs is difficult and often results in marginal performance gains when going from 1 core to 4 cores to 16 cores. Beyond 4 cores, memory bandwidth becomes the bottleneck to further performance increases.

To harness the parallel computing power of GPUs, programmers can simply modify the performance-critical portions of an application to take advantage of the hundreds of parallel cores in the GPU. The rest of the application remains the same, making the most efficient use of all cores in the system. Running a function on the GPU involves rewriting that function to expose its parallelism, then adding a few new function calls to indicate which functions will run on the GPU or the CPU. With these modifications, the performance-critical portions of the application can now run significantly faster on the GPU.
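As an illustration only (not taken from the brochure), the following minimal CUDA C sketch shows that pattern: a performance-critical loop is rewritten as a kernel to expose its parallelism, and a few added calls (cudaMalloc, cudaMemcpy, and the kernel launch) mark what runs on the GPU. The kernel name, array size, and values are hypothetical.

// Minimal CUDA C sketch of offloading one performance-critical function to the GPU.
#include <cuda_runtime.h>
#include <stdio.h>
#include <stdlib.h>

// The former CPU loop, rewritten as a kernel: each GPU thread handles one element.
__global__ void saxpy(int n, float a, const float *x, float *y) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) y[i] = a * x[i] + y[i];
}

int main(void) {
    const int n = 1 << 20;
    size_t bytes = n * sizeof(float);

    // Host data, prepared by the unchanged (CPU) part of the application.
    float *x = (float *)malloc(bytes), *y = (float *)malloc(bytes);
    for (int i = 0; i < n; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

    // New calls that mark what runs on the GPU: allocate, copy, launch, copy back.
    float *d_x, *d_y;
    cudaMalloc((void **)&d_x, bytes);
    cudaMalloc((void **)&d_y, bytes);
    cudaMemcpy(d_x, x, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(d_y, y, bytes, cudaMemcpyHostToDevice);

    saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, d_x, d_y);

    cudaMemcpy(y, d_y, bytes, cudaMemcpyDeviceToHost);
    printf("y[0] = %f\n", y[0]);  // expect 4.0

    cudaFree(d_x); cudaFree(d_y); free(x); free(y);
    return 0;
}

Everything else in the host program stays ordinary C compiled alongside the kernel by NVCC, which is what makes this incremental porting approach practical.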

Figure: Core comparison between a CPU (multiple cores) and a GPU (hundreds of cores).

Figure: Developers use industry-standard languages and tools to program massively parallel CUDA GPUs. GPU computing applications build on libraries and middleware, language solutions (C, C++, Fortran, and Java and Python interfaces), and device-level APIs (DirectCompute, OpenCL), all running on the NVIDIA GPU CUDA parallel computing architecture.

The next generation CUDA computing architecture, codenamed Fermi.

"History will record Fermi as a significant milestone."

Dave Patterson, Director, Parallel Computing Research Laboratory, U.C. Berkeley; co-author of Computer Architecture: A Quantitative Approach


University of Illinois: Accelerated molecular modeling enables rapid response to H1N1

CHALLENGE A first step in mitigating a global pandemic, like H1N1, requires quickly developing drugs to effectively treat a virus that is new and likely to evolve. This requires a compute-intensive process to determine how, in the case of H1N1, mutations of the flu virus protein could disrupt the binding pathway of the antiviral drug Tamiflu, rendering it potentially ineffective. This determination involved a daunting simulation of a 35,000-atom system, something a group of University of Illinois, Urbana-Champaign scientists, led by John Stone, decided to tackle in a new way using GPUs.

Conducting this kind of simulation on a CPU would take more than a month to calculate, and that would only amount to a single simulation, not the multiple simulations that constitute a complete study.

SOLUTION Stone and his team turned to the NVIDIA CUDA parallel processing architecture running on Tesla GPUs to perform their molecular modeling calculations and simulate the drug resistance of H1N1 mutations. Thanks to GPU technology, the scientists could efficiently run multiple simulations and achieve potentially life-saving results faster.

IMPACT The GPU-accelerated calculation was completed in just over an hour. The almost thousand-fold improvement in performance available through GPU computing and advanced algorithms empowered the scientists to perform emergency computing to study biological problems of extreme relevance and share their results with the medical research community.

This speed and performance increase not only enabled researchers to fulfill their original goal of testing Tamiflu's efficacy in treating H1N1 and its mutations, but it also bought them time to make other important discoveries. Further calculations showed that genetic mutations which render the swine or avian flu resistant to Tamiflu had actually disrupted the binding funnel, providing new understanding about a fundamental mechanism behind drug resistance.

In the midst of the H1N1 pandemic, the use of improved algorithms based on CUDA and running on Tesla GPUs made it possible to produce actionable results about the efficacy of Tamiflu during a single afternoon. It would have taken weeks or months of computing to produce the same results using conventional approaches.

Ribosome for protein synthesis.

WHO BENEFITS FROM GPU COMPUTING?

Computational scientists and researchers who are using GPUs to accelerate their applications are seeing results in days instead of months, even minutes instead of days.

The benefits of GPU computing can be replicated in other research areas as well. "All of this work is made speedier and more efficient thanks to GPU technology, which for us means quicker results as well as dollars and energy saved."

John Stone, Sr. Research Programmer, University of Illinois at Urbana-Champaign


Harvard University: Finding Hidden Heart Problems Faster

320 detector-row CT has enabled single heart beat coronary imaging so that the entire coronary contrast opacification can be evaluated at a single time point. The full 3D course of the arteries, in turn, allows researchers to simulate the blood flowing through it by using computational fluid flow simulations, and subsequently compute the endothelial shear stress.

CHALLENGE Heart attacks, the leading cause of death worldwide, are caused when plaque that has built up on artery walls dislodges and blocks the flow of blood to the heart. Up to 80% of heart attacks are caused by plaque that is not detectable by conventional medical imaging. Even viewing the 20% that is detectable requires invasive endoscopic procedures, which involve running several feet of tubing into the patient in an effort to take pictures of arterial plaque.

This level of uncertainty with regard to the exact location of potentially deadly plaque poses a significant challenge for cardiologists. Historically, it has been a guessing game for heart specialists to determine if and where to place arterial stents in patients with blockages. Knowing the location of the plaque could greatly improve patient care and save lives.

SOLUTION A team of researchers, including doctors at Harvard Medical School and Brigham & Women's Hospital in Boston, Massachusetts, has discovered a non-invasive way to find dangerous plaque in a patient's arteries. Tapping into the computational power of GPUs, they can create a highly individualized model of blood flow within a patient in a study called hemodynamics.

The buildup of plaque is highly correlated to the shape, or geometry, of a patient's arterial structure. Bends in an artery tend to be areas where dangerous plaque is especially concentrated. Using imaging devices like a CT scan, scientists are able to create a model of a patient's circulatory system. From there, an advanced fluid dynamics simulation of the blood flow through the patient's arteries can be conducted on a computer to identify areas of reduced endothelial shear stress on the arterial wall. A complex simulation like this one requires billions of fluid elements to be modeled as they pass through an artery system. An area of reduced shear stress indicates that plaque has formed on the interior artery walls, preventing the bloodstream from making contact with the inner wall.

The overall output of the simulation provides doctors with an atherosclerotic risk map. The map provides cardiologists with the location of hidden plaque and can serve as an indicator as to where stents may eventually need to be placed, and all of this knowledge is gained without invasive imaging techniques or exploratory surgery.

IMPACT GPUs provide 20x more computational power and an order of magnitude more performance per dollar to the application of image reconstruction and blood flow simulation, finally making such advanced simulation techniques practical at the clinical level. Without GPUs, the amount of computing equipment, in terms of size and expense, would render a hemodynamics approach unusable.

Because it can detect dangerous arterial plaque earlier than any other method, it is expected that this breakthrough could save numerous lives when it is approved for deployment in hospitals and research centers.


MotionDSP: The increasing importance of the GPU in the Armed Forces


CHALLENGE Unmanned Aerial Vehicles (UAVs) represent the latest in high-tech weaponry deployed to strengthen and improve the military's capabilities. But with new technologies come new challenges, such as capturing actionable intelligence while flying at speeds upwards of 140 mph, 10 miles above the earth.

One key feature of the UAV is that it is capable of providing a real-time stream of detailed images taken with multiple cameras on the vehicle simultaneously. The challenge is that the resulting images need to be rendered, stabilized and enhanced in real-time and across vast distances in order to be useful. Once they have been processed, the images can give infantry critical information about potential challenges ahead, the end goal being to ensure the safety and protection of military personnel in the field. Using CPUs alone, this process is very time consuming and does not allow information to be viewed in real-time. As a result, military action could be based on potentially outdated intelligence data and inaccurate guides.

SOLUTION MotionDSP, a software company based in San Mateo, California, has developed super-resolution algorithms that allow it to reconstruct video with better and cleaner detail, increased resolution and reduced noise, all of which are ideal for the live streaming of video from the cameras attached to a UAV.

MotionDSP's product, Ikena ISR, leverages NVIDIA's CUDA parallel computing architecture, allowing it to render, stabilize and enhance live video faster and more accurately than its competitors. Ikena ISR features computationally intense, advanced motion-tracking algorithms that provide the basis for sophisticated image stabilization and super-resolution video reconstruction. Perhaps most importantly, it can all be run on off-the-shelf Windows laptops and servers. Using NVIDIA Tesla GPUs, MotionDSP's customers, which include a variety of military-funded research groups, are making UAVs safer and more reliable while reducing deployment costs, improving simulation accuracy and dramatically boosting performance.

IMPACT Using only CPUs to execute the kind of sophisticated video post-processing algorithms required for effective reconnaissance would result in up to six hours of processing for each one hour of video, which is not a viable solution when real-time results are critical. In contrast, Tesla GPUs enable MotionDSP's Ikena ISR software to process any live video source in real-time with less than 200 ms of latency. Moreover, instead of requiring expensive CPU-clustered computing systems to complete the work, Ikena can perform at full capacity on a standard workstation small enough to fit inside military vehicles.

Merlin International is one of the fastest growing providers of information technology solutions in the United States; their Collaborative Video Delivery offering, which includes Ikena, helps support the defense and intelligence missions of the US Federal Government.

"MotionDSP's use of GPU technology has greatly enhanced the capabilities of its Ikena software, enabling it to deliver real-time super-resolution analysis of intelligence video, something that simply was not possible before," said John Trauth, President of Merlin International. "Integrating this technology into our Collaborative Video Delivery solution can enable our government customers to quickly and easily access the data they need for effective intelligence, surveillance and reconnaissance (ISR); this saves lives and significantly increases mission success rates."

Super-resolution algorithms allow MotionDSP to reconstruct video with better and cleaner detail, increased resolution and reduced noise.

"With the GPU, we're bringing higher quality video to all ISR platforms, including smaller UAVs, by being smarter about how we utilize COTS PC technology. Our technology makes the impossible possible, and this is making our military safer and better prepared."

Sean Varah, Chief Executive Officer, MotionDSP


Bloomberg: GPUs increase accuracy and reduce processing time for bond pricing

Financial engineering is integral to today's buying and selling decisions.

CHALLENGE Getting a mortgage and buying a home is a complex financial transaction, and for lenders, the competitive pricing and management of that mortgage is an even greater challenge. Transactions involving thousands of mortgages at once are a routine occurrence in financial markets, spurred by banks that wish to sell off loans to get their money back sooner.

Known as collateralized debt obligations (CDO) and collateralized mortgage obligations (CMO), baskets of thousands of loans are publicly traded financial instruments. For the banks and institutional investors who buy and sell these baskets, timely pricing updates are essential because of fast-changing market conditions. Bloomberg, one of the world's leading financial services organizations, prices CDO/CMO baskets for its customers by running powerful algorithms that model the risks and determine the price.


This technique requires calculating huge amounts of data, from interest rate volatility to the payment behavior of individual borrowers. These data-intensive calculations can take hours to run with a CPU-based computing grid. Time is money, and Bloomberg wanted a new solution that would allow them to get pricing updates to their customers faster.

SOLUTION Bloomberg implemented an NVIDIA Tesla GPU computing solution in their datacenter. By porting their application to run on the NVIDIA CUDA parallel processing architecture to harness the power of GPUs, Bloomberg received dramatic improvements across the board. Large calculations that had previously taken up to two hours can now be completed in two minutes. Smaller runs that had taken 20 minutes can now be performed in just seconds.

In addition, the capital outlay for the new GPU-based solution was one-tenth the cost of an upgraded CPU solution, and further savings are being realized due to the GPU's efficient power and cooling needs.

IMPACT As Bloomberg customers make CDO/CMO buying and selling decisions, they now have access to the best and most current pricing information, giving them a serious competitive trading advantage in a market where timing is everything.

"One of the challenges Bloomberg always faces is that we operate at very large scale. We're serving all the financial and business community and there are a lot of different instruments and models people want calculated."

Shawn Edwards, CTO, Bloomberg

GPU ACCELERATION

Large calculations that had previously taken up to two hours can now be completed in two minutes. Smaller runs that had taken 20 minutes can now be performed in just seconds.


ffA: Accelerating 3D seismic interpretation

CHALLENGE In the search for oil and gas, the geological information provided by seismic images of the earth is vital. By interpreting the data produced by seismic imaging surveys, geoscientists can identify the likely presence of hydrocarbon reserves and understand how to extract resources most effectively. Today, sophisticated visualization systems and computational analysis tools are used to streamline what was previously a subjective and labor intensive process.

Geoscientists must process increasing amounts of data as dwindling reserves require them to pinpoint smaller, more complex reservoirs with greater speed and accuracy.

SOLUTION UK-based company ffA provides world leading 3D seismic analysis software and services to the global oil and gas industry. Its software tools extract detailed information from 3D seismic data, providing a greater understanding of complex 3D geology, improving productivity and reducing uncertainty within the interpretation process. The sophisticated tools are compute-intensive, so it can take hours, or even days, to produce results on conventional high performance workstations.

With the recent release of its CUDA-enabled 3D seismic analysis application, ffA users routinely achieve over an order of magnitude speed-up compared with performance on high-end multi-core CPUs.

The latest benchmark results using Tesla GPUs have produced performance improvements of up to 37x versus high-end workstations with two quad-core CPUs.

Figure: Speedups of a Tesla GPU + CPU configuration versus a quad-core CPU. Data courtesy of RMOTC.

"CUDA-based GPUs power large computational tasks and interactive computational work that we could not hope to implement effectively otherwise."

Steve Purves, Technical Director, ffA

This step change in performance significantly increases the amount of data that geoscientists can analyze in a given timeframe. Plus, it allows them to fully exploit the information derived from 3D seismic surveys to improve subsurface understanding and reduce risk in oil and gas exploration and exploitation.

IMPACT NVIDIA CUDA is allowing ffA to provide scalable high performance computation for seismic data on hardware platforms equipped with one or more NVIDIA Tesla and Quadro GPUs. The latest benchmark results using Tesla GPUs have produced performance improvements of up to 37x versus high-end workstations with two quad-core CPUs.

"Access to high performance, high quality 3D computational tools on a workstation platform drastically improves the productivity curve in 3D seismic analysis and seismic interpretation, giving our users a real edge in oil and gas exploration and de-risking field development."


THE SHIFT TO PARALLEL COMPUTING AND THE PATH TO EXASCALE

Approximately every 10 years, the world of supercomputing experiences a fundamental shift in computing architectures. It was around 10 years ago when cluster-based computing largely superseded vector-based computing as the de facto standard for large-scale computing installations, and this shift saw the supercomputing industry move beyond the petaFLOP performance barrier. With the next performance target being exascale-class computing, it is time for a new shift in computing architectures: the move to parallel computing.

The shift is already underway.


Nebulae, powered by 4,640 Tesla 20-series GPUs, is one of the fastest supercomputers in the world.

In November 2008, Tokyo Institute of Technology became the first supercomputing center to enter the Top500 with a GPU-based hybrid system, a system that uses GPUs and CPUs together to deliver transformative increases in performance without breaking the bank with regards to energy consumption. The system, called TSUBAME 1.2, entered the list at number 24.

Fast forward to June 2010, and hybrid systems have started to make appearances even higher up the list. Nebulae, a system installed at the Shenzhen Supercomputing Center in China and equipped with 4,640 Tesla 20-series GPUs, made its entry into the list at number 2, just one spot behind Oak Ridge National Lab's Jaguar, the fastest supercomputer in the world.

What is even more impressive than the overall performance of Nebulae is how little power it consumes. While Jaguar delivers 1.77 petaFLOPS, it consumes more than 7 megawatts of power to do so. To put that into context, 7 megawatts is enough energy to power 15,000 homes. In contrast, Nebulae delivers 1.27 petaFLOPS, yet it does this within a power budget of just 2.55 megawatts. That makes it roughly twice as power-efficient as Jaguar (about 0.50 GFLOPS per watt versus about 0.25 GFLOPS per watt). This difference in computational throughput and power is owed to the massively parallel architecture of GPUs, where hundreds of cores work together within a single processor, delivering unprecedented compute density for next generation supercomputers.

Another very notable entry into this year's Top500 was the Chinese Academy of Sciences (CAS). The Mole 8.5 supercomputer at CAS uses 2,200 Tesla 20-series GPUs to deliver 207 teraFLOPS, which puts it at number 19 in the Top500.

"Future computing architectures will be hybrid systems with parallel-core GPUs working in tandem with multi-core CPUs."

Jack Dongarra, Distinguished Professor at University of Tennessee

CAS is one of the world's most dynamic research and educational facilities. Prof. Wei Ge, Professor of Chemical Engineering at the Institute of Process Engineering at CAS, introduced GPU computing to the Beijing facility in 2007 to help them with discrete particle and molecular dynamics simulations. Since then, parallel computing has enabled the advancement of research in dozens of other areas, including real-time simulations of industrial facilities, the design and optimization of multi-phase and turbulent industrial reactors using computational fluid dynamics, the optimization of secondary and tertiary oil recovery using multi-scale simulation of porous materials, the simulation of nano- and micro-flow in chemical and bio-chemical processes, and much more.

These computational problems represent a tiny fraction of the entire landscape of computational challenges that we face today, and these problems are not getting any smaller. The sheer quantity of data that many scientists, engineers and researchers must analyze is increasing exponentially, and supercomputing centers are over-subscribed as demand is outpacing the supply of computational resources. If we are to maintain our rate of innovation and discovery, we must take computational performance to a level where it is 1000 times faster than what it is today. The GPU is a transformative force in supercomputing and represents the only viable strategy to successfully build exascale systems that are affordable to build and efficient to operate.

"Five years from now, the bulk of serious HPC is going to be done with some kind of accelerated heterogeneous architecture," said Steve Scott, CTO of Cray Inc.

TSUBAME 1.2, by Tokyo Institute of Technology, was the first Tesla GPU-based hybrid cluster to appear on the Top500 list.


NVIDIA TESLA: WORLD'S FIRST COMPUTATIONAL GPU

Tesla 20-series GPU computing solutions are designed from the ground up for high-performance computing and are based on NVIDIA's latest CUDA GPU architecture, code named Fermi. It delivers many must-have features for HPC, including ECC memory for uncompromised accuracy and scalability, C++ support, and 7x the double precision performance of CPUs. Compared to typical quad-core CPUs, Tesla 20-series GPU computing products can deliver equivalent performance at 1/10th the cost and 1/20th the power consumption.


TESLA GPU COMPUTING SOLUTIONS

NVIDIA Tesla products are designed for high-performance computing and offer exclusive computing features.

Superior Performance
> Highest double precision floating point performance
> Large HPC data sets supported by larger on-board memory
> Faster communication with InfiniBand using NVIDIA GPUDirect

Highly Reliable
> ECC protection for uncompromised data reliability
> Stress tested for zero error tolerance
> Manufactured by NVIDIA
> Enterprise-level support that includes a three-year warranty

Designed for HPC
> Integrated by leading OEMs into workstations, servers and blades


TESLA DATA CENTER PRODUCTS

Available from OEMs and certified resellers, Tesla GPU computing products are designed to supercharge your computing cluster.

Tesla M2050/M2070 GPU Computing Module enables the use of GPUs and CPUs together in an individual server node or blade form factor.

Tesla S2050 GPU Computing System is a 1U system powered by 4 Tesla GPUs and connects to a CPU server.

Figure: Highest Performance, Highest Efficiency. GPU-CPU server solutions deliver 8x higher Linpack performance (Gflops), roughly 5x higher performance per dollar (Gflops/$K), and roughly 4x higher performance per watt (Gflops/kW) than a CPU-only server. CPU 1U server: 2x Intel Xeon X5550 (Nehalem) 2.66 GHz, 48 GB memory, $7K, 0.55 kW. GPU-CPU 1U server: 2x Tesla C2050 + 2x Intel Xeon X5550, 48 GB memory, $11K, 1.0 kW.

Tesla C2050/C2070 GPU Computing Processor delivers the power of a cluster in the form factor of a workstation.

TESLA WORKSTATION PRODUCTS

Designed to deliver cluster-level performance on a workstation, the NVIDIA Tesla GPU Computing Processors fuel the transition to parallel computing while making personal supercomputing possible, right at your desk.

Figure: Highest Performance, Highest Efficiency. Workstations powered by Tesla GPUs outperform conventional CPU solutions in life science applications. Speedups of a Tesla C2050 versus an Intel Xeon X5550 CPU are shown for MIDG (Discontinuous Galerkin solvers for PDEs), AMBER molecular dynamics (mixed precision), FFT (cuFFT 3.1 in double precision), OpenEye ROCS virtual drug screening, and radix sort (CUDA SDK).


DEVELOPER ECOSYSTEM AND WORLDWIDE EDUCATION

In just a few years, an entire software ecosystem has developed around the CUDA architecture, from more than 350 universities worldwide teaching the CUDA programming model, to a wide range of libraries, compilers and middleware that help users optimize applications for GPUs.

The NVIDIA GPU Computing Ecosystem: a rich ecosystem of software applications, libraries, programming language solutions, and service providers supports the CUDA parallel computing architecture.

CUDA

NVIDIA's CUDA architecture has the industry's most robust language and API support for GPU computing developers, including C, C++, OpenCL, DirectCompute, and Fortran. NVIDIA Parallel Nsight, a fully integrated development environment for Microsoft Visual Studio, is also available. Used by more than six million developers worldwide, Visual Studio is one of the world's most popular development environments for Windows-based applications and services. Adding functionality specifically for GPU computing developers, Parallel Nsight makes the power of the GPU more accessible than ever before.

In addition to the CUDA C development tools, math libraries, and hundreds of code samples in the NVIDIA GPU computing SDK, there is also a rich ecosystem of solutions:

Libraries and Middleware Solutions
> Acceleware FDTD libraries
> CUBLAS, complete BLAS library*
> CUFFT, high-performance FFT routines*
> CUSP
> EM Photonics CULA Tools, heterogeneous LAPACK implementation
> NVIDIA OptiX and other AXEs*
> NVIDIA Performance Primitives for image and video processing, www.nvidia.com/npp*
> Thrust (see the usage sketch after this list)
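As an illustration only (not from the brochure), here is a minimal sketch of how a developer might use one of the libraries above, Thrust, to sort data on the GPU; the vector size and random keys are arbitrary, and the file would be compiled with NVCC.

// Minimal, hypothetical sketch: GPU sort with the Thrust library (compile with nvcc).
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>
#include <thrust/sort.h>
#include <thrust/copy.h>
#include <cstdlib>

int main() {
    // Prepare 1M random keys on the host (CPU).
    thrust::host_vector<int> h(1 << 20);
    for (size_t i = 0; i < h.size(); ++i) h[i] = std::rand();

    // Copy to the GPU, sort there using the library's parallel sort, copy back.
    thrust::device_vector<int> d = h;
    thrust::sort(d.begin(), d.end());
    thrust::copy(d.begin(), d.end(), h.begin());
    return 0;
}

The library handles kernel launches and memory movement internally, which is the point of the middleware layer described above.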

Compilers and Language Solutions
> CAPS HMPP
> NVIDIA CUDA C Compiler (NVCC), supporting both CUDA C and CUDA C++*
> Par4All
> PGI CUDA Fortran
> PGI Accelerator Compilers for C and Fortran
> PyCUDA

GPU Debugging Tools
> Allinea DDT
> Fixstars Eclipse plug-in
> NVIDIA cuda-gdb*
> NVIDIA Parallel Nsight for Visual Studio
> TotalView Debugger

GPU Performance Analysis Tools
> NVIDIA Visual Profiler*
> NVIDIA Parallel Nsight for Visual Studio
> TAU CUDA
> Vampir

Cluster and Grid Management Solutions
> Bright Cluster Manager
> NVIDIA system management interface (nvidia-smi)
> Platform Computing

Math Packages
> Jacket by AccelerEyes
> Mathematica 8 by Wolfram
> MATLAB Distributed Computing Server (MDCS) by MathWorks
> MATLAB Parallel Computing Toolbox (PCT) by MathWorks

Consulting and Training

Consulting and training services are available to support you in porting applications and learning about developing with CUDA. For more information, visit www.nvidia.com/object/cuda_consultants.html.

Education and Certification
> CUDA Certification: www.nvidia.com/certification
> CUDA Center of Excellence: research.nvidia.com
> CUDA Research Centers: research.nvidia.com/
> CUDA Teaching Centers: research.nvidia.com/
> CUDA and GPU computing books

* Available with the latest CUDA toolkit at www.nvidia.com/getcuda

For more information about the CUDA Certification Program, visit www.nvidia.com/certification.

Figure: The CUDA ecosystem spans an integrated development environment, research & education, libraries, tools & partners, all major platforms, mathematical packages, languages & APIs, and consultants, training, & certification.

NVIDIA Parallel Nsight software is the industry's first development environment for massively parallel computing integrated into Microsoft Visual Studio. It integrates CPU and GPU development, allowing developers to create optimal GPU-accelerated applications.