Professional CAE Product Development | GTC...

By: Veenus A V, Associate GM & Lead, NeST-NVIDIA Center for GPU computing, Trivandrum, India

Office: NeST/SFO Technologies, San Jose, CA, www.nestsoftware.com

veenusav @ gmail.com


Page 1: Professional CAE Product Development | GTC 2013on-demand.gputechconf.com/gtc/2013/presentations/S... · 2.5 TB 3.2 GB GPU means - More ... neat result Same area. So isosceles or equilateral


Page 2:

“Do not simply believe in anything because you have heard it.

› No matter even if I have told it!

Believe only after you observe and analyze.”

Reference: Anguttara Nikaya, Vol 1, 188-193

Sri Buddha

Page 3:

Application Architecture policies

› Scientific Visualization

Software blends with the platform

Demands of modern users

Proof of Concept >> Product

Page 4:

We are showing a few technical experiments for your understanding.

Not a PRODUCT demonstration!

Page 6:

Pre Solver Post

Page 7:

The data structures are plain to process

› Maybe a few arrays. An undergraduate can understand all of these in plain form.

Graphics is not that vast

› Compared to a typical game, it is a simple deal. Na?!!

A bit serious results – Users will adjust!

Page 8:

.. Let me explain our background before continuing ..

Page 9:

It will reveal why we are proceeding this way…!

Page 12:

We make specific software solutions for your scientific needs.

Page 13:

We are specialized in engineering software development.

NeST-NVIDIA center for GPU computing

› Lab specifically for GPU based technologies

› Inaugurated by Dr. Bill Dally, Chief Scientist, NVIDIA

Page 14:

How to architect the software for your futuristic hardware and software..

Proof-of-concept to Product

Not giving emphasis on:

› Features of the applications

› Algorithms

Page 15:

Pre Solver Post

In focus: Scientific data visualization

Page 16:

[Diagram: data around the multi-physics solver]

› Shapes and geometry

› Outer surfaces

› Volume (v or thd)

› Analysis model (boundary & other params)

› Multi physics Solver

› Results Volume (tensor)

› Display Frame (image)

Page 17:

[Diagram: data around the multi-physics solver, with prior knowledge added]

› Shapes and geometry

› Outer surfaces

› Volume (v or thd)

› Analysis model (boundary & other params)

› Known model (Expert system db, some cases)

› Historical Experiment Results

› Multi physics Solver

› Results Volume (tensor)

› Display Frame (image)

For eg: inverse modeling process

Page 18:

Workstation PC

Page 19:

[Diagram: data sizes along the pipeline]

› Shapes and geometry

› Outer surfaces

› Volume (tetrahedron)

› Known model (Expert system db, some cases)

› Historical Experiment Results

› Multi physics Solver

› Results Volume

› PC Display Frame 1920 x 1080

› Tablet Display Frame 1080 x 720

For eg: inverse modeling process

Sizes marked on the diagram: 48.8 KB, 591 MB, 5.6 GB, 154.6 GB x 10, 5.93 MB, 2.22 MB, 2.5 TB, 3.2 GB
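
The two display-frame sizes on this slide can be checked with simple arithmetic. The sketch below assumes uncompressed frames at 3 bytes per pixel (24-bit RGB) and binary megabytes; neither assumption is stated on the slide, and the helper names are made up for illustration. Under those assumptions the results agree with the 5.93 MB and 2.22 MB figures.

```python
def frame_bytes(width, height, bytes_per_pixel=3):
    """Size of one uncompressed frame in bytes.

    bytes_per_pixel=3 (24-bit RGB) is an assumption, not taken from the slide.
    """
    return width * height * bytes_per_pixel


def to_mb(n_bytes):
    """Convert bytes to megabytes, using 1 MB = 1024 * 1024 bytes (assumed)."""
    return n_bytes / (1024 * 1024)


pc = frame_bytes(1920, 1080)      # PC display frame
tablet = frame_bytes(1080, 720)   # tablet display frame

print(f"PC frame:     {to_mb(pc):.2f} MB")
print(f"Tablet frame: {to_mb(tablet):.2f} MB")
```

The point of the comparison: whatever the size of the results volume, the data that finally has to reach a screen is only a few megabytes per frame.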

Page 20:

Workstation PC

Page 21:

[Diagram: workstation hardware and data paths]

› CPU cores, CPU RAM

› HDD, SSD

› Interfaces: SATA, DDR3, PCI Express, Gigabit Ethernet; motherboard bus

› GPU memory (global, GDDR5), GPU cache (2D), texture memory

› GPU cores with fast local (shared) memory

› Network: intranet user (tablet), Internet remote user (tablet or browser)

Transfer rates marked on the diagram: 10 GB/s, 340 MB/s, 12 GB/s, 5.3 GB/s, 42 GB/s, 350~550 MB/s, 70~130 MB/s

Page 22:

Algorithms are good.

Mathematics has been doing fine for centuries…

› Newton’s laws, Maxwell's equations still hold good.

Proofs of concept might be the best in the world!

Page 23:

The data structures are plain to process

› Maybe a few arrays. An undergraduate can understand all of these in plain form.

Graphics is not that vast

› Compared to a typical game, it is a simple deal. Na?!!

A bit serious results – Users will adjust!

Page 24:

A popular myth – PCI Express cannot deliver data to the monitor..!

PCI Express can give a good frame rate if your data is ready in CPU memory

› There are a lot of points like this when you closely watch the platform facts..

› GPU for FLOPS only…
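
The frame-rate claim can be sanity-checked with a back-of-the-envelope bound: frames per second cannot exceed link bandwidth divided by frame size. The sketch below is only an illustration. It assumes an uncompressed 1920 x 1080 frame at 3 bytes per pixel, uses decimal units for the rates, and plugs in three of the transfer rates marked on the hardware diagram without claiming which link each one belongs to.

```python
FRAME_BYTES = 1920 * 1080 * 3  # uncompressed 24-bit RGB frame (assumed format)


def max_fps(bandwidth_bytes_per_s, frame_bytes=FRAME_BYTES):
    """Upper bound on frames/s if the link did nothing but move whole frames."""
    return bandwidth_bytes_per_s / frame_bytes


GB = 1e9  # decimal units, an assumption about how the slide's rates are quoted
MB = 1e6

# Rates copied from the hardware diagram; the link each belongs to is not assumed here.
for label, rate in [("5.3 GB/s", 5.3 * GB), ("340 MB/s", 340 * MB), ("70 MB/s", 70 * MB)]:
    print(f"{label:>9}: at most {max_fps(rate):7.1f} frames/s")
```

Real throughput will be lower because of latency, protocol overhead and contention, so this is a ceiling rather than a prediction.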

Page 27:

[Diagram: data sizes along the pipeline]

› Shapes and geometry

› Outer surfaces

› Volume (tetrahedron)

› Known model (Expert system db, some cases)

› Historical Experiment Results

› Multi physics Solver

› Results Volume

› PC Display Frame 1920 x 1080

› Tablet Display Frame 1080 x 720

For eg: inverse modeling process

Sizes marked on the diagram: 48.8 KB, 591 MB, 5.6 GB, 154.6 GB, 5.93 MB, 2.22 MB, 2.5 TB, 3.2 GB

Page 29:

GPU means - More FLOPS/$, FLOPS/real-estate.

Use GLSL for graphics (SH 5.0 gives you freedom of mesh quality too!)

CUDA syntax is simple; do data flow analysis for maximum throughput

But don’t forget to juice your CPU too!

Page 30:

Offline processing before graphics viewer

› Even letting your user have a coffee before he starts the analysis!

› Extra data - Mind HDD space and transfer rate

Spatially order data

› The viewer will seek like that.

› Processor wait means DELAY! 2D locality of reference

Make an LoD arrangement

› Users want response, not always ‘details’!
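
One common way to get the 2D locality of reference mentioned above is to store tiles in Z-order (Morton order), so tiles that are close on screen tend to be close on disk and in memory. The slide does not say which ordering the author used, so this is a sketch of one well-known option. The function names are illustrative.

```python
def part1by1(n):
    """Spread the low 16 bits of n so that a zero bit sits between each original bit."""
    n &= 0x0000FFFF
    n = (n | (n << 8)) & 0x00FF00FF
    n = (n | (n << 4)) & 0x0F0F0F0F
    n = (n | (n << 2)) & 0x33333333
    n = (n | (n << 1)) & 0x55555555
    return n


def morton2d(x, y):
    """Interleave the bits of x and y into a single Z-order index."""
    return part1by1(x) | (part1by1(y) << 1)


# Reorder a 4 x 4 grid of tiles from row-major order into Z-order.
tiles = [(x, y) for y in range(4) for x in range(4)]
ordered = sorted(tiles, key=lambda t: morton2d(*t))
print(ordered[:4])  # the first 2 x 2 block of tiles ends up contiguous
```

The same idea extends to 3D volumes by interleaving three coordinates instead of two.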

Page 31:

Maximum parallelism, WARP full, threads > cores

Only compute for the device and screen.

› Higher resolution is not always needed.

› User wants responsive software

› Pixel shader is your time eater.. Resolution of RT

› If the GPU is also utilized for other compute, decide these based on real response metrics. Do 2D bicubic instead.
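
"WARP full, threads > cores" translates into a simple sizing rule for a kernel launch: pick a block size that is a multiple of the warp size (32 threads on NVIDIA GPUs) and enough blocks to cover every work item. The helper below sketches that arithmetic in Python. The default block size of 256 and the one-thread-per-pixel mapping are illustrative assumptions, not values from the talk.

```python
WARP_SIZE = 32  # threads per warp on NVIDIA GPUs


def launch_config(n_items, block_size=256):
    """Return (grid_size, block_size) covering n_items with one thread per item.

    block_size=256 is an illustrative default, not a recommendation from the talk.
    """
    if block_size % WARP_SIZE != 0:
        raise ValueError("block size should be a multiple of the warp size")
    grid_size = (n_items + block_size - 1) // block_size  # ceiling division
    return grid_size, block_size


# One thread per pixel of a 1920 x 1080 target: far more threads than cores.
grid, block = launch_config(1920 * 1080)
print(grid, block)
```

Sizing the work to the output device keeps the thread count tied to what the user actually sees.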

Page 32:

› Texelize .. Texelize….

Read-only data: knowing this gives the GPU cache freedom…

› Use the asynchronous system to the maximum

The processor is not the only ‘active’ component on the board!

Use CUDA streams or texture switching…

Page 33:

It's the time of BYOD

Do watch software systems on specific platforms

› For googling: Kepler GRID, cloud gaming

Page 34:

Volume viewer – voxelized data

Geometric Editor – Mesh can be perfect!

Preparation for solver - inverse modeling with GPU (only platform work)

Remote visualization for post processor

Page 36:

Video

Page 38:

Volume resolution and dimensions

› Avoid empty spaces

› Bricking

› Compression

Quality Graphics demanded

› Phong SM
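
Bricking and empty-space skipping can be illustrated together: divide the volume into fixed-size bricks and keep only the bricks that contain data. This is a minimal sketch, assuming that a voxel value of zero means "empty" and using a brick edge of 8 voxels. Both are illustrative choices, and the function name is hypothetical.

```python
def occupied_bricks(voxels, brick=8):
    """Return the set of brick indices that contain at least one non-empty voxel.

    voxels: iterable of (x, y, z, value); a value of 0 counts as empty (assumption).
    brick:  brick edge length in voxels (8 is an illustrative choice).
    """
    keep = set()
    for x, y, z, value in voxels:
        if value != 0:
            keep.add((x // brick, y // brick, z // brick))
    return keep


# Toy 32^3 volume with data only in a small block near the origin.
volume = [(x, y, z, 1 if (x < 10 and y < 4 and z < 4) else 0)
          for x in range(32) for y in range(32) for z in range(32)]

bricks = occupied_bricks(volume)
total = (32 // 8) ** 3
print(f"{len(bricks)} of {total} bricks contain data")
```

Only the occupied bricks would then need to be stored, compressed, or uploaded to the GPU.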

Page 39:

Video

Page 41:

Algorithm based on Laplacian

The operations involved are as follows:

› Select an ROI in the mesh on the screen

› Draw a sketch on the screen suggesting the edited region of the mesh

› The model is reshaped to fit the curve while still retaining its shape.
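
The reason Laplacian methods can retain shape is that they describe each vertex relative to its neighbours rather than by absolute position. The slide does not give the exact formulation, so the sketch below uses the simplest variant (uniform weights on a closed 2D polyline) purely as an illustration. It shows that these differential coordinates are unchanged when the whole shape is moved. A full editor would additionally solve a least-squares system to fit the sketched curve, which is not shown here.

```python
def laplacian_coords(points):
    """Uniform Laplacian (differential) coordinates of a closed 2D polyline.

    Each entry is the vertex position minus the average of its two neighbours.
    """
    n = len(points)
    deltas = []
    for i, (x, y) in enumerate(points):
        px, py = points[i - 1]          # previous vertex (wraps around)
        nx, ny = points[(i + 1) % n]    # next vertex (wraps around)
        deltas.append((x - (px + nx) / 2, y - (py + ny) / 2))
    return deltas


square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
moved = [(x + 5.0, y - 3.0) for x, y in square]

# Local detail is identical before and after the translation.
print(laplacian_coords(square) == laplacian_coords(moved))
```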

Page 42:

2D edge tracking to 3D was a challenge

Used a modified form of classic CPU algorithms.

› Doing this on the GPU was difficult

› Created regular triangles on the fly to give a neat result

Same area, so isosceles or equilateral

Page 43:

Real-world data was used to make the model

Huge data inputs

› Point cloud, volumetric, high data rate

Inverse modeling techniques used by preparatory algorithm

SVD to avoid non-significant information

Challenge – partial volume correlation

Page 44:

Volume division optimized for maximum threads in GPU and MPI

Model the control flow (limit) as per the locality heuristics (expert system with direction vectors)

Always handle borders separately (good for the processor)

Each module may not be that fast..!

› Win the war.. Not every battle…!

Page 45:

Users demand BYOD

Not all features – but a subset

KEPLER GRID is the most awaited hardware

Page 47:

Features

› HTML5 client

› Stream based server

› LoD based RayCaster viewer TO Nvidia iRay

› Serviced on a GPU cluster

Challenge

› Time-to-market: Conversion of existing engine

› Multi user support and faster data speed

Page 49:

Proof-of-concept level complexities - Algorithm level research

Development process – How to manage projects that involve scientific work and new platform challenges.

Test automation architecture

Deployment scenarios and hardware tune-up at the final level (it is a fact always!)

Page 50:

Remember Kalama Sutta…

› Your questions may transform my thinking…

Please ask even after the session [email protected]

www.nestsoftware.com

Page 51:

Do write to us on technical and business queries.

› Speaker: veenusav @ gmail.com

› Website: www.nestsoftware.com

› Business queries: [email protected]