Tesla Brochure 12 Lr
karthikraman -
8/6/2019 Tesla Brochure 12 Lr
1/14
NVIDIA TESLA
GPU COMPUTING: REVOLUTIONIZING HIGH PERFORMANCE COMPUTING
To learn more, go to www.nvidia.com/tesla
© 2010 NVIDIA, the NVIDIA logo, CUDA, GPUDirect, Parallel Nsight, and Tesla are trademarks and/or registered trademarks of NVIDIA Corporation in the United States and other countries. Other company and product names may be trademarks of the respective companies with which they are associated. All rights reserved. 10/2010
GPUS ARE REVOLUTIONIZING COMPUTING
The high performance computing (HPC) industry's need for computation is increasing, as large and complex computational problems become commonplace across many industry segments. Traditional CPU technology, however, is no longer capable of scaling in performance sufficiently to address this demand.
The parallel processing capability of the Graphics Processing Unit (GPU) allows it to divide complex computing tasks into thousands of smaller tasks that can be run concurrently. This ability is enabling computational scientists and researchers to address some of the world's most challenging computational problems up to several orders of magnitude faster.
The use of GPUs for computation is a dramatic shift in HPC. GPUs deliver performance increases of 10x to 100x to solve problems in minutes instead of hours, outpacing the performance of traditional computing with x86-based CPUs alone. In addition, GPUs deliver greater performance per watt of power consumed.
From climate modeling to medical tomography, NVIDIA Tesla GPUs are enabling a wide variety of segments in science and industry to progress in ways that were previously impractical, or even impossible, due to technological limitations.
Conventional CPU computing architecture is no longer scaling to match the demands of HPC.
Co-processing refers to the use of an accelerator, such as a GPU, to offload the CPU and to increase computational efficiency.
[Chart: Performance vs. VAX, log scale from 1 to 10,000, years 1978 to 2016. Legend: CPU, GPU. Growth-per-year labels: 25%/year, 52%/year, 20%/year. Callout: 100x performance advantage by 2021.]
WHY GPU COMPUTING?
With the ever-increasing demand for more computing performance, the HPC industry is moving toward a hybrid computing model, where GPUs and CPUs work together to perform general purpose computing tasks.
As parallel processors, GPUs excel at tackling large amounts of similar data because the problem can be split into hundreds or thousands of pieces and calculated simultaneously.
As sequential processors, CPUs are not designed for this type of computation, but they are adept at more serial-based tasks such as running operating systems and organizing data. NVIDIA's GPU solutions outpace others as they apply the most relevant processor to the specific task in hand.
Tesla GPU computing is delivering transformative increases in performance for a wide range of HPC industry segments.
50X MATLAB Computing (AccelerEyes)
14[...] Financial [...] (Oxford U[...])
30X Gene Sequencing (U of Maryland)
3[...] Molecular [...] (U of Illinois, Ur[...])
146X Medical Imaging (U of Utah)
20X 3D Ultrasound (TechniScan)
18X Video Transcoding (Elemental Technologies)
5X Digital Content Creation (Adobe)
100X Astrophysics (RIKEN)
80X Weather Modeling (Tokyo Institute of Technology)
Source: Hennessy & Patterson, CAAQA, 4th Edition.
"The convergence of new, fast GPUs optimized for computation as well as 3D graphics acceleration and industry-standard software development tools marks the real beginning of the GPU computing era."
Nathan Brookwood, Principal Analyst & Co-Founder, Insight64
CUDA PARALLEL COMPUTING ARCHITECTURE
CUDA is NVIDIA's parallel computing architecture. Applications that leverage the CUDA architecture can be developed in a variety of languages and APIs, including C, C++, Fortran, OpenCL, and DirectCompute.
The CUDA architecture contains hundreds of cores capable of running many thousands of parallel threads, while the CUDA programming model lets programmers focus on parallelizing their algorithms and not the mechanics of the language.
The latest generation CUDA architecture, codenamed Fermi, is the most advanced GPU computing architecture ever built. With over three billion transistors, Fermi is making GPU and CPU co-processing pervasive by addressing the full spectrum of computing applications. With support for C++, GPUs based on the Fermi architecture make parallel processing easier and accelerate performance on a wider array of applications than ever before. Just a few applications that can experience significant performance benefits include ray tracing, finite element analysis, high-precision scientific computing, sparse linear algebra, sorting, and search algorithms.
PARALLEL ACCELERATION
Multi-core programming with x86 CPUs is difficult and often results in marginal performance gains when going from 1 core to 4 cores to 16 cores. Beyond 4 cores, memory bandwidth becomes the bottleneck to further performance increases.
To harness the parallel computing power of GPUs, programmers can simply modify the performance-critical portions of an application to take advantage of the hundreds of parallel cores in the GPU. The rest of the application remains the same, making the most efficient use of all cores in the system. Running a function on the GPU involves rewriting that function to expose its parallelism, then adding a few new function calls to indicate which functions will run on the GPU or the CPU. With these modifications, the performance-critical portions of the application can now run significantly faster on the GPU.
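The rewrite described above can be sketched in plain Python, with no GPU involved. A serial loop is recast as a per-element "kernel" that depends only on its own index, plus a launcher that applies the kernel to every index. The names saxpy_serial, saxpy_kernel and launch are hypothetical and chosen only for illustration; the thread pool merely stands in for the many hardware threads a GPU would provide, and it is not how CUDA itself is invoked.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_serial(a, x, y):
    """Original CPU-style version: one loop walks the whole array."""
    out = [0.0] * len(x)
    for i in range(len(x)):
        out[i] = a * x[i] + y[i]
    return out


def saxpy_kernel(i, a, x, y, out):
    """Per-element 'kernel' (hypothetical name): computes only element i."""
    out[i] = a * x[i] + y[i]


def launch(kernel, n, *args, workers=4):
    """Stand-in for a GPU launch: run kernel(i, ...) for every i in range(n)."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all tasks and surfaces any exception they raised.
        list(pool.map(lambda i: kernel(i, *args), range(n)))


x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(x)
launch(saxpy_kernel, len(x), 2.0, x, y, out)
print(out)
```

Because each call to the kernel touches only its own element, the calls can run in any order or all at once, which is the property that exposes the parallelism.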
Core comparison between a CPU and a GPU. CPU: Multiple Cores. GPU: Hundreds of Cores.
Developers use industry[...] languages and tools to [...] massively parallel CUDA [...]
[Diagram: GPU Computing Applications; Libraries and Middleware; Language Solutions (C, C++, Fortran, Java and Python interfaces); Device level APIs (DirectCompute, OpenCL); NVIDIA GPU CUDA Parallel Computing Architecture.]
The next generation CUDA [...] computing architecture, codenamed Fermi.
"History will record Fermi as a significant milestone."
Dave Patterson, Director, Parallel Computing Research Laboratory, U.C. Berkeley
Co-author of Computer Architecture: A Quantitative Approach
University of Illinois: Accelerated molecular modeling enables rapid response to H1N1
CHALLENGE A first step in mitigating a global pandemic such as H1N1 is to quickly develop drugs that effectively treat a virus that is new and likely to evolve. This requires a compute-intensive process to determine how, in the case of H1N1, mutations of the flu virus protein could disrupt the binding pathway of the drug Tamiflu, rendering it potentially ineffective. This determination involved a daunting simulation of a 35,000-atom system, something a group of University of Illinois, Urbana-Champaign scientists, led by John Stone, decided to tackle in a new way using GPUs.
Conducting this kind of simulation on a CPU would take more than a month to calculate, and that would only amount to a single simulation, not the multiple simulations that constitute a complete study.
SOLUTION Stone and his team turned to the NVIDIA CUDA parallel processing architecture running on Tesla GPUs to perform their molecular modeling calculations and simulate the drug resistance of H1N1 mutations. Thanks to GPU technology, the scientists could efficiently run multiple simulations and achieve potentially life-saving results faster.
IMPACT The GPU-accelerated calculation was completed in just over an hour. The almost thousand-fold improvement in performance available through GPU computing and advanced algorithms empowered the scientists to perform emergency computing to study biological problems of extreme relevance and share their results with the medical research community.
This speed and performance increase not only enabled researchers to fulfill their original goal (testing Tamiflu's efficacy in treating H1N1 and its mutations) but also bought them time to make other important discoveries. Further calculations showed that genetic mutations which render the swine or avian flu resistant to Tamiflu had actually disrupted the binding funnel, providing new understanding about a fundamental mechanism behind drug resistance.
In the midst of the H1N1 pandemic, the use of improved algorithms based on CUDA and running on Tesla GPUs made it possible to produce actionable results about the efficacy of Tamiflu during a single afternoon. Producing the same results with conventional approaches would have taken weeks or months of computing.
Ribosome for protein synthesis.
WHO BENEFITS FROM GPU COMPUTING?
Computational scientists and researchers who are using GPUs to accelerate their applications are seeing results in days instead of months, even minutes instead of days.
"The benefits of GPU computing can be replicated in other research areas as well. All of this work is made speedier and more efficient thanks to GPU technology, which for us means quicker results as well as dollars and energy saved."
John Stone, Sr. Research Programmer, University of Illinois at Urbana-Champaign
Harvard University: Finding Hidden Heart Problems Faster
320 detector-row CT has enabled single heart beat coronary imaging so that the entire coronary contrast opacification can be evaluated at a single time point. The full 3D course of the arteries, in turn, allows researchers to simulate the blood flowing through them by using computational fluid flow simulations, and subsequently compute the endothelial shear stress.
CHALLENGE Heart attacks, the leading cause of death worldwide, are caused when plaque that has built up on artery walls dislodges and blocks the flow of blood to the heart. Up to 80% of heart attacks are caused by plaque that is not detectable by conventional medical imaging. Even viewing the 20% that is detectable requires invasive endoscopic procedures, which involve running several feet of tubing into the patient in an effort to take pictures of arterial plaque.
This level of uncertainty with regard to the exact location of potentially deadly plaque poses a significant challenge for cardiologists. Historically, it has been a guessing game for heart specialists to determine if and where to place arterial stents in patients with blockages. Knowing the location of the plaque could greatly improve patient care and save lives.
SOLUTION A team of researchers, including doctors at Harvard Medical School and Brigham & Women's Hospital in Boston, Massachusetts, has discovered a non-invasive way to find dangerous plaque in a patient's arteries. Tapping into the computational power of GPUs, they can create a highly individualized model of blood flow within a patient in a study called hemodynamics.
The buildup of plaque is highly correlated to the shape, or geometry, of a patient's arterial structure. Bends in an artery tend to be areas where dangerous plaque is especially concentrated.
Using imaging devices such as a CT scanner, scientists are able to create a model of a patient's circulatory system. From there, an advanced fluid dynamics simulation of the blood flow through the patient's arteries can be conducted on a computer to identify areas of reduced endothelial shear stress on the arterial wall. A complex simulation like this one requires billions of fluid elements to be modeled as they pass through an artery system. An area of reduced shear stress indicates that plaque has formed on the interior artery walls, preventing the bloodstream from making contact with the inner wall. The overall output of the simulation provides doctors with an atherosclerotic risk map. The map provides cardiologists with the location of hidden plaque and can serve as an indicator as to where stents may eventually need to be placed, and all of this knowledge is gained without invasive imaging techniques or exploratory surgery.
IMPACT GPUs provide 20x more computational power and an order of magnitude more performance per dollar to the application of image reconstruction and blood flow simulation, finally making such advanced simulation techniques practical at the clinical level. Without GPUs, the amount of computing equipment, in terms of size and expense, would render a hemodynamics approach unusable.
Because it can detect dangerous arterial plaque earlier than any other method, it is expected that this breakthrough could save numerous lives when it is approved for deployment in hospitals and research centers.
MotionDSP: The increasing importance of the GPU in the Armed Forces
MotionDSP's product, Ikena ISR, leverages NVIDIA's CUDA parallel computing architecture, allowing it to render, stabilize and enhance live video faster and more accurately than its competitors.
Ikena ISR features computationally intense, advanced motion-tracking algorithms that provide the basis for sophisticated image stabilization and super-resolution video reconstruction. Perhaps most importantly, it can all be run on off-the-shelf Windows laptops and servers.
Using NVIDIA Tesla GPUs, MotionDSP's customers, which include a variety of military-funded research groups, are making UAVs safer and more reliable while reducing deployment costs, improving simulation accuracy and dramatically boosting performance.
IMPACT Using only CPUs to execute the kind of sophisticated video post-processing algorithms required for effective reconnaissance would result in up to six hours of processing for each one hour of video, not a viable solution when real-time results are critical. In contrast, Tesla GPUs enable MotionDSP's Ikena ISR software to process any live video source in real-time with less than 200 ms of latency. Moreover, instead of requiring expensive CPU-clustered computing systems to complete the work, Ikena can perform at full capacity on a standard workstation small enough to fit inside military vehicles.
Merlin International is one of the fastest growing providers of information technology solutions in the United States; their Collaborative Video Delivery offering, which includes Ikena, helps support the defense and intelligence missions of the US Federal Government.
"MotionDSP's use of GPU technology has greatly enhanced the capabilities of its Ikena software, enabling it to deliver real-time super-resolution analysis of intelligence video, something that simply was not possible before," said John Trauth, President of Merlin International. "Integrating this technology into our Collaborative Video Delivery solution can enable our government customers to quickly and easily access the data they need for effective intelligence, surveillance and reconnaissance (ISR); this saves lives and significantly increases mission success rates."
Super-resolution algorithms allow MotionDSP to reconstruct video with better and cleaner detail, increased resolution and reduced noise.
"With the GPU, we're bringing higher quality video to all ISR platforms, including smaller UAVs, by being smarter about how we utilize COTS PC technology. Our technology makes the impossible possible, and this is making our military safer and better prepared."
Sea[...], Chief Executive Officer, MotionDSP
CHALLENGE Unmanned Aerial Vehicles (UAVs) represent the latest in high-tech weaponry deployed to strengthen and improve the military's capabilities. But with new technologies come new challenges, such as capturing actionable intelligence while flying at speeds upwards of 140 mph, 10 miles above the earth.
One key feature of the UAV is that it is capable of providing a real-time stream of detailed images taken with multiple cameras on the vehicle simultaneously. The challenge is that the resulting images need to be rendered, stabilized and enhanced in real-time and across vast distances in order to be useful.
Once they have been processed, the images can give infantry critical information about potential challenges ahead, the end goal being to ensure the safety and protection of military personnel in the field.
Using CPUs alone, this process is very time consuming and does not allow information to be viewed in real-time. As a result, military action could be based on potentially outdated intelligence data and inaccurate guides.
SOLUTION MotionDSP, a software company based in San Mateo, California, has developed super-resolution algorithms that allow it to reconstruct video with better and cleaner detail, increased resolution and reduced noise, all of which are ideal for the live streaming of video from the cameras attached to a UAV.
Bloomberg: GPUs increase accuracy and reduce processing time for bond pricing
Financial engineering is integral to today's buying and selling decisions.
CHALLENGE Getting a mortgage and buying a home is a complex financial transaction, and for lenders, the competitive pricing and management of that mortgage is an even greater challenge. Transactions involving thousands of mortgages at once are a routine occurrence in financial markets, spurred by banks that wish to sell off loans to get their money back sooner.
Known as collateralized debt obligations (CDO) and collateralized mortgage obligations (CMO), baskets of thousands of loans are publicly traded financial instruments. For the banks and institutional investors who buy and sell these baskets, timely pricing updates are essential because of fast-changing market conditions.
Bloomberg, one of the world's leading financial services organizations, prices CDO/CMO baskets for its customers by running powerful algorithms that model the risks and determine the price.
This technique requires processing huge amounts of data, from interest rate volatility to the payment behavior of individual borrowers. These data-intensive calculations can take hours to run with a CPU-based computing grid. Time is money, and Bloomberg wanted a new solution that would allow them to get pricing updates to their customers faster.
SOLUTION Bloomberg implemented an NVIDIA Tesla GPU computing solution in their datacenter. By porting their application to run on the NVIDIA CUDA parallel processing architecture to harness the power of GPUs, Bloomberg achieved dramatic improvements across the board. Large calculations that had previously taken up to two hours can now be completed in two minutes. Smaller runs that had taken 20 minutes can now be performed in just seconds.
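As a quick check of the figures above, a speedup is simply the old run time divided by the new one. The sketch below uses only the two-hour and two-minute figures quoted in the text; the helper name speedup is hypothetical, and because the text says "up to two hours", the result is best read as an upper bound.

```python
def speedup(before_seconds, after_seconds):
    """Return how many times faster the new run is than the old one."""
    return before_seconds / after_seconds


two_hours = 2 * 60 * 60   # large calculation on the CPU grid, in seconds
two_minutes = 2 * 60      # same calculation on the GPU solution, in seconds

large_run_speedup = speedup(two_hours, two_minutes)
print(f"Large runs: about {large_run_speedup:.0f}x faster")
```

The smaller runs are not computed here because the text gives their new run time only as "just seconds", with no exact figure.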
In addition, the capital outlay for the new GPU-based solution was one-tenth the cost of an upgraded CPU solution, and further savings are being realized due to the GPUs' efficient power and cooling needs.
IMPACT As Bloomberg customers make CDO/CMO buying and selling decisions, they now have access to the best and most current pricing information, giving them a serious competitive trading advantage in a market where timing is everything.
"One of the challenges Bloomberg always faces is that we [...] very large scale. We're serving all the financial and business community and there are a lot of different instruments and models people want calculated."
Shaw[...], CTO, Bloomberg
ffA: Accelerating 3D seismic interpretation
CHALLENGE In the search for oil and gas, the geological information provided by seismic images of the earth is vital. By interpreting the data produced by seismic imaging surveys, geoscientists can identify the likely presence of hydrocarbon reserves and understand how to extract resources most effectively. Today, sophisticated visualization systems and computational analysis tools are used to streamline what was previously a subjective and labor intensive process.
Geoscientists must also process increasing amounts of data as dwindling reserves require them to pinpoint smaller, more complex reservoirs with greater speed and accuracy.
SOLUTION UK-based company ffA provides world leading 3D seismic analysis software and services to the global oil and gas industry. Its software tools extract detailed information from 3D seismic data, providing a greater understanding of complex 3D geology, improving productivity and reducing uncertainty within the interpretation process. The sophisticated tools are compute-intensive so it can take hours, or even days, to produce results on conventional high performance workstations.
With the recent release of its CUDA enabled 3D seismic analysis application, ffA users routinely achieve over an order of magnitude speed-up compared with performance on high end multi-core CPUs.
The latest benchmark results using Tesla GPUs have produced performance improvements of up to 37x versus high-end workstations with two quad core CPUs.
[Chart: speedups on a scale of 0 to 40, Quad-Core CPU vs. Tesla GPU + CPU. Data courtesy of RMOTC.]
"CUDA-based GPUs power large computa[...] tasks and interactive computational work[...] that we could not hope to implement effectively otherwise."
Stev[...], ffA Technical [...]
This step change in performance significantly increases the amount of data that geoscientists can analyze in a given timeframe. Plus, it allows them to fully exploit the information derived from 3D seismic surveys to improve subsurface understanding and reduce risk in oil and gas exploration and exploitation.
IMPACT NVIDIA CUDA is allowing ffA to provide scalable high performance computation for seismic data on hardware platforms equipped with one or more NVIDIA Tesla and Quadro GPUs. The latest benchmark results using Tesla GPUs have produced performance improvements of up to 37x versus high-end workstations with two quad core CPUs.
Access to high performance, high quality 3D computational tools on a workstation platform drastically improves the productivity curve in 3D seismic analysis and seismic interpretation, giving our users a real edge in oil and gas exploration and de-risking field development.
THE SHIFT TO PARALLEL COMPUTING AND THE PATH TO EXASCALE
Approximately every 10 years, the world of supercomputing experiences a fundamental shift in computing architectures. It was around 10 years ago when cluster-based computing largely superseded vector-based computing as the de facto standard for large-scale computing installations, and this shift saw the supercomputing industry move beyond the petaFLOP performance barrier. With the next performance target being exascale-class computing, it is time for a new shift in computing architectures: the move to parallel computing.
The shift is already underway.
Nebulae, powered by 4640 Tesla 20-series GPUs, is one of the fastest supercomputers in the world.
In November 2008, Tokyo Institute of Technology became the first supercomputing center to enter the Top500 with a GPU-based hybrid system: a system that uses GPUs and CPUs together to deliver transformative increases in performance without breaking the bank with regards to energy consumption. The system, called TSUBAME 1.2, entered the list at number 24.
Fast forward to June 2010 and hybrid systems have started to make appearances even higher up the list. Nebulae, a system installed at the Shenzhen Supercomputing Center in China, equipped with 4640 Tesla 20-series GPUs, made its entry into the list at number 2, just one spot behind Oak Ridge National Lab's Jaguar, the fastest supercomputer in the world.
What is even more impressive than the overall performance of Nebulae is how little power it consumes. While Jaguar delivers 1.77 petaFLOPs, it consumes more than 7 megawatts of power to do so. To put that into context, 7 megawatts is enough to power 15,000 homes. In contrast, Nebulae delivers 1.27 petaFLOPs, yet it does this within a power budget of just 2.55 megawatts. That makes it roughly twice as power-efficient as Jaguar. This difference in computational throughput and power is owed to the massively parallel architecture of GPUs, where hundreds of cores work together within a single processor, delivering unprecedented compute density for next generation supercomputers.
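The power-efficiency comparison above can be reproduced from the quoted figures alone. The sketch below divides each system's petaFLOPs by its megawatts; the function name petaflops_per_megawatt is hypothetical, and Jaguar's power draw is taken as exactly 7 megawatts even though the text says "more than 7", so the computed ratio is a lower bound.

```python
def petaflops_per_megawatt(petaflops, megawatts):
    """Power efficiency as delivered performance per unit of power."""
    return petaflops / megawatts


jaguar = petaflops_per_megawatt(1.77, 7.0)     # assumes exactly 7 MW
nebulae = petaflops_per_megawatt(1.27, 2.55)

ratio = nebulae / jaguar
print(f"Jaguar:  {jaguar:.3f} petaFLOPs per MW")
print(f"Nebulae: {nebulae:.3f} petaFLOPs per MW")
print(f"Nebulae is about {ratio:.2f}x as power-efficient")
```

With these inputs the ratio comes out just under 2, which is consistent with the "twice as power-efficient" claim.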
Another very notable entry into this year's Top500 was the Chinese Academy of Sciences (CAS). The Mole 8.5 supercomputer at CAS uses 2200 Tesla 20-series GPUs to deliver 207 teraFLOPS, which puts it at number 19 in the Top500.
"Future computing architectures will be hybrid systems with parallel-core GPUs working in tandem with multi-core CPUs."
Jack Dongarra, Distinguished Professor at University of Tennessee
CAS is one of the world's most dynamic research and educational facilities. Prof. Wei Ge, Professor of Chemical Engineering at the Institute of Process Engineering at CAS, introduced GPU computing to the Beijing facility in 2007 to help them with discrete particle and molecular dynamics simulations. Since then, parallel computing has enabled the advancement of research in dozens of other areas, including: real-time simulations of industrial facilities, the design and optimization of multi-phase and turbulent industrial reactors using computational fluid dynamics, the optimization of secondary and tertiary oil recovery using multi-scale simulation of porous materials, the simulation of nano- and micro-flow in chemical and bio-chemical processes, and much more.
These computational problems represent a tiny fraction of the entire landscape of computational challenges that we face today, and these problems are not getting any smaller. The sheer quantity of data that many scientists, engineers and researchers must analyze is increasing exponentially and supercomputing centers are over-subscribed as demand is outpacing the supply of computational resources. If we are to maintain our rate of innovation and discovery, we must take computational performance to a level where it is 1000 times faster than what it is today. The GPU is a transformative force in supercomputing and represents the only viable strategy to successfully build exascale systems that are affordable to build and efficient to operate.
"Five years from now, the bulk of serious HPC is going to be done with some kind of accelerated heterogeneous architecture," said Steve Scott, CTO of Cray Inc.
TSUBAME 1.2, by Tokyo Institute of Technology, was the first Tesla GPU based hybrid cluster to appear on the Top500 list.
NVIDIA TESLA: WORLD'S FIRST COMPUTATIONAL GPU
Tesla 20-series GPU computing solutions are designed from the ground up for high-performance computing and are based on NVIDIA's latest CUDA GPU architecture, code named Fermi. They deliver many "must have" features for HPC including ECC memory for uncompromised accuracy and scalability, C++ support, and 7x the double precision performance of CPUs. Compared to typical quad-core CPUs, Tesla 20-series GPU computing products can deliver equivalent performance at 1/10th the cost and 1/20th the power consumption.
TESLA GPU COMPUTING SOLUTIONS
NVIDIA Tesla products are designed for high-performance computing, and offer exclusive computing features.
Superior Performance
> Highest double precision floating point performance
> Large HPC data sets supported by larger on-board memory
> Faster communication with InfiniBand using NVIDIA GPUDirect
Highly Reliable
> ECC protection for uncompromised data reliability
> Stress tested for zero error tolerance
> Manufactured by NVIDIA
> Enterprise-level support that includes a three-year warranty
Designed for HPC
> Integrated by leading OEMs into workstations, servers and blades
TESLA DATA CENTER PRODUCTS
Available from OEMs and certified resellers, Tesla GPU computing products are designed to supercharge your computing cluster.
Tesla M2050/M2070 GPU Computing Module enables the use of GPUs and CPUs together in an individual server node or blade form factor.
Tesla S2050 GPU Computing System is a 1U system powered by 4 Tesla GPUs and connects to a CPU server.
Highest Performance, Highest Efficiency
[Charts: CPU Server vs. GPU-CPU Server. Performance (Gflops): 8x. Performance/$ (Gflops/$K): 5x. Performance/watt (Gflops/kwatt): 4x.]
CPU 1U Server: 2x Intel Xeon X5550 (Nehalem) 2.66 GHz, 48 GB memory, $7K, 0.55 kW
GPU-CPU 1U Server: 2x Tesla C2050 + 2x Intel Xeon X5550, 48 GB memory, $11K, 1.0 kW
GPU-CPU server solutions [...] 8x higher Linpack performance [...]
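The 5x and 4x figures follow from the 8x performance figure once the higher price and power draw of the GPU-CPU server are taken into account. The sketch below works through that arithmetic using only the prices and power figures listed above; the Server class and variable names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    cost_k_usd: float   # purchase price in thousands of dollars
    power_kw: float     # power draw in kilowatts


cpu = Server("CPU 1U server", cost_k_usd=7.0, power_kw=0.55)
gpu = Server("GPU-CPU 1U server", cost_k_usd=11.0, power_kw=1.0)

perf_ratio = 8.0  # Linpack performance advantage quoted for the GPU-CPU server

# Divide the performance gain by the relative increase in cost and in power.
perf_per_dollar_ratio = perf_ratio / (gpu.cost_k_usd / cpu.cost_k_usd)
perf_per_watt_ratio = perf_ratio / (gpu.power_kw / cpu.power_kw)

print(f"Performance per dollar: {perf_per_dollar_ratio:.1f}x")
print(f"Performance per watt:   {perf_per_watt_ratio:.1f}x")
```

The results land slightly above 5 and 4 respectively, which suggests the chart labels are rounded down.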
TESLA WORKSTATION PRODUCTS
Designed to deliver cluster-level performance on a workstation, the NVIDIA Tesla GPU Computing Processors fuel the transition to parallel computing while making personal supercomputing possible, right at your desk.
Tesla C2050/C2070 GPU Computing Processor delivers the power of a cluster in the form factor of a workstation.
Highest Performance, Highest Efficiency
Workstations powered by Tesla [...] outperform conventional CPU [...] solutions in life science applications.
[Charts: speedups, Intel Xeon X5550 CPU vs. Tesla C2050, for five benchmarks: MIDG: Discontinuous Galerkin Solvers for PDEs; AMBER Molecular Dynamics (Mixed Precision); FFT: cuFFT 3.1 in Double Precision; OpenEye ROCS Virtual Drug Screening; Radix Sort CUDA SDK.]
DEVELOPER ECOSYSTEM AND WORLDWIDE EDUCATION
In just a few years, an entire software ecosystem has developed around the CUDA architecture, from more than 350 universities worldwide teaching the CUDA programming model, to a wide range of libraries, compilers and middleware that help users optimize applications for GPUs.
The NVIDIA GPU Computing Ecosystem
A rich ecosystem of software applications, libraries, programming language solutions, and service providers supports the CUDA parallel computing architecture.
CUDA
NVIDIA's CUDA architecture has the industry's most robust language and API support for GPU computing developers, including C, C++, OpenCL, DirectCompute, and Fortran. NVIDIA Parallel Nsight, a fully integrated development environment for Microsoft Visual Studio, is also available. Used by more than six million developers worldwide, Visual Studio is one of the world's most popular development environments for Windows-based applications and services. Adding functionality specifically for GPU computing developers, Parallel Nsight makes the power of the GPU more accessible than ever before.
In addition to the CUDA C development tools, math libraries, and hundreds of code samples in the NVIDIA GPU computing SDK, there is also a rich ecosystem of solutions:
Libraries and Middleware Solutions
> Acceleware FDTD libraries
> CUBLAS, complete BLAS library*
> CUFFT, high-performance FFT routines*
> CUSP
> EM Photonics CULA Tools, heterogeneous LAPACK implementation
> NVIDIA OptiX and other AXEs*
> NVIDIA Performance Primitives for image and video processing* (www.nvidia.com/npp)
> Thrust
Compilers and Language Solutions
> CAPS HMPP
> NVIDIA CUDA C Compiler (NVCC), supporting both CUDA C and CUDA C++*
> Par4All
> PGI CUDA Fortran
> PGI Accelerator Compilers for C and Fortran
> PyCUDA
GPU Debugging Tools
> Allinea DDT
> Fixstars Eclipse plug-in
> NVIDIA cuda-gdb*
> NVIDIA Parallel Nsight for Visual Studio
> TotalView Debugger
GPU Performance Analysis Tools
> NVIDIA Visual Profiler*
> NVIDIA Parallel Nsight for Visual Studio
> TAU CUDA
> Vampir
Cluster and Grid Management Solutions
> Bright Cluster Manager
> NVIDIA system management interface (nvidia-smi)
> Platform Computing
Math Packages
> Jacket by AccelerEyes
> Mathematica 8 by Wolfram
> MATLAB Distributed Computing Server (MDCS) by MathWorks
> MATLAB Parallel Computing Toolbox (PCT) by MathWorks
Consulting and Training
Consulting and training services are available to support you in porting applications and learning about developing with CUDA. For more information, visit www.nvidia.com/object/cuda_consultants.html.
Education and Certification
> CUDA Certification (www.nvidia.com/certification)
> CUDA Center of Excellence (research.nvidia.com)
> CUDA Research Centers (research.nvidia.com/)
> CUDA Teaching Centers (research.nvidia.com/)
> CUDA and GPU computing books
* Available with the latest CUDA toolkit at www.nvidia.com/getcuda
For more information about the CUDA Certification Program, visit www.nvidia.com/certification.
[Diagram: the NVIDIA GPU Computing Ecosystem. Integrated Development Environment; Research & Education; Libraries; Tools & Partners; All Major Platforms; Mathematical Packages; Languages & APIs; Consultants, Training, & Certification.]
NVIDIA Parallel Nsight software is the industry's first development environment for massively parallel computing integrated into Microsoft Visual Studio. It integrates CPU and GPU development, allowing developers to create optimal GPU-accelerated applications.