GPU-Accelerated Design Optimization on the Cloud

Post on 25-May-2022

Krishnan Suresh

Associate Professor

Mechanical Engineering

Design Optimization

Reduce weight subject to constraints

(GE/GrabCAD)

A structure subject to loading

Design Optimization Domains

(OptiStruct)

(Generico)

Big Players, Big $

• ANSYS

• Abaqus

• Altair/OptiStruct

• Nastran

• SolidWorks

• …

$10 billion investment annually (technavio.com)

Design Optimization on the Cloud

Browser-driven design optimization

Design Optimization on the Cloud


Client

• No software/hardware investment

• Pay as you go

• Anywhere, anytime

• …

Service Provider

• Easier maintenance

• Larger market

• …

3D-Printing

Democratization of fabrication

Design Optimization to 3D Printing

Democratization of design

Catch?


Design Optimization

Design Space → Finite Element Analysis (FEA) → Optimal? If No: Change Design, then repeat FEA

Each FEA: Solve Kd = f, with 10^5 to 10^7 DOF

K: Sparse SPD

100s of iterations!

Design Optimization Cost

A naïve port to the cloud will not work!

• OptiStruct (commercial code)

• Xeon E5 2697, 92 GB

• 20 hours!

Cloud Based Design Optimization

Fast Limited Memory FEA

Pareto Optimization

GPU Acceleration

WebGL

Fast Limited Memory FEA

Cloud Based Design Optimization

FEA Bottleneck: Kd = f

Design Space → Finite Element Analysis (FEA) → Optimal? If No: Change Topology, then repeat FEA

Each FEA: Solve Kd = f, with 10^5 to 10^7 DOF

K: Sparse SPD

100s of iterations!
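To make the bottleneck concrete, here is a minimal sketch of a plain conjugate gradient (CG) solve for Kd = f. It is not the solver used in the talk; the function names and the small spring-chain test operator are assumptions made for illustration. The point it shows is that CG touches K only through a matrix-vector product, once per iteration, so that product dominates the cost.

```python
# Sketch only: plain conjugate gradient (CG) for K d = f, with K symmetric positive definite.
# The solver touches K only through `matvec`, so K never has to be stored explicitly.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(matvec, f, tol=1e-10, max_iter=1000):
    """Solve K d = f given only a function matvec(v) that returns K v."""
    d = [0.0] * len(f)
    r = list(f)                          # residual r = f - K d, with d = 0 initially
    p = list(r)                          # first search direction
    rr = dot(r, r)
    f_norm = dot(f, f) ** 0.5 or 1.0     # guard against f = 0
    for iteration in range(max_iter):
        if rr ** 0.5 <= tol * f_norm:
            return d, iteration
        Kp = matvec(p)                   # the one matrix-vector product per iteration
        alpha = rr / dot(p, Kp)
        d = [di + alpha * pi for di, pi in zip(d, p)]
        r = [ri - alpha * kpi for ri, kpi in zip(r, Kp)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return d, max_iter


def chain_matvec(v):
    """Hypothetical test operator: tridiag(-1, 2, -1), unit springs in a chain fixed at both ends."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * v[i] - left - right)
    return out


f = [1.0] * 8
d, iterations = conjugate_gradient(chain_matvec, f)
residual = [fi - ki for fi, ki in zip(f, chain_matvec(d))]
```

Passing the operator as a function rather than a stored matrix is what lets an assembly-free product be dropped in later without changing the solver.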


Kd = f (GTC)

• Fine-grained Parallel Preconditioners

• CULA

• MAGMA

• Accelerating Iterative Linear Solvers

• Efficient AMG on Hybrid GPU Clusters

• Preconditioning for Large-Scale Linear Solvers

• …

Design Optimization: Kd = f

• Exploit mesh congruency

• Exploit physics behavior

• K constantly changing

• …

Exploit Mesh Congruency (GTC 2014)

Model → Discretize → Assemble/Solve (Kd = f) → Post-process

Mesh-aware SpMV Acceleration: Congruence

Element Congruency

62,350 elements

2,780 distinct

95.5% congruent

Observation: Large meshes contain many similar elements!

Elements are ‘rigid-body/scaling’ congruent

⇒ Identical element stiffness Ke

Only store Ke of distinct elements
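The 95.5% figure follows from the counts above: 1 - 2780/62350 ≈ 0.955. As a rough illustration of how congruent elements might be detected, the sketch below groups elements by a shape key. It is a simplification, not the method from the talk: it only recognizes elements that differ by a pure translation, whereas the slide also allows rotation and scaling. The mesh, the function names, and the rounding tolerance are all hypothetical.

```python
# Sketch only: group elements whose node coordinates differ by a pure translation.
# Rotation and scaling congruence, mentioned on the slide, are NOT handled here.
from collections import defaultdict


def shape_signature(nodes, coords, digits=9):
    """Translation-invariant key: each node's offset from the element's first node."""
    x0 = coords[nodes[0]]
    return tuple(
        tuple(round(c - c0, digits) for c, c0 in zip(coords[n], x0))
        for n in nodes
    )


def group_congruent(elements, coords):
    """Map each distinct signature to the list of element indices that share it."""
    groups = defaultdict(list)
    for index, nodes in enumerate(elements):
        groups[shape_signature(nodes, coords)].append(index)
    return groups


# Hypothetical 2D mesh: three unit squares in a row, then one 2 x 1 rectangle.
coords = {
    0: (0.0, 0.0), 1: (1.0, 0.0), 2: (2.0, 0.0), 3: (3.0, 0.0), 4: (5.0, 0.0),
    5: (0.0, 1.0), 6: (1.0, 1.0), 7: (2.0, 1.0), 8: (3.0, 1.0), 9: (5.0, 1.0),
}
elements = [(0, 1, 6, 5), (1, 2, 7, 6), (2, 3, 8, 7), (3, 4, 9, 8)]

groups = group_congruent(elements, coords)
distinct = len(groups)                                  # number of Ke matrices to store
congruent_fraction = 1.0 - distinct / len(elements)     # same ratio as on the slide
```

Only one element stiffness matrix per key in `groups` would need to be stored; every other element in the same group reuses it.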

Implication: SpMV

$Kd$: Sparse Matrix-Vector Multiplication (SpMV)

Critical operation in ALL iterative solvers

Classic: $Kd \equiv \left(\sum_{i=1}^{N} K_e\right) d$

Assembly-free: $Kd \equiv \sum_{i=1}^{N} \left(K_e d_e\right)$

Only store Ke of distinct elements + Assembly Free
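The two forms of the product can be compared on a toy problem. The sketch below is an illustration under assumptions, not the implementation from the talk: the data layout (`elements`, `shape_of`, `Ke_by_shape`) and the 1D bar are hypothetical. The assembly-free version gathers each element's degrees of freedom, multiplies by the stored Ke for that element's shape, and scatter-adds the result; the assembled version builds a dense global K first, purely as a reference.

```python
# Sketch only: assembly-free K d = sum over elements of Ke de,
# storing one Ke per distinct element shape instead of the global matrix.

def assembly_free_matvec(elements, shape_of, Ke_by_shape, d):
    """elements[i]: global DOF indices of element i; shape_of[i]: key into Ke_by_shape."""
    out = [0.0] * len(d)
    for dofs, shape in zip(elements, shape_of):
        Ke = Ke_by_shape[shape]
        de = [d[g] for g in dofs]                      # gather element DOFs
        for a, ga in enumerate(dofs):
            # scatter-add row a of Ke de into the global result
            out[ga] += sum(Ke[a][b] * de[b] for b in range(len(dofs)))
    return out


def assembled_matvec(elements, shape_of, Ke_by_shape, d):
    """Reference: assemble a dense global K first, then multiply."""
    n = len(d)
    K = [[0.0] * n for _ in range(n)]
    for dofs, shape in zip(elements, shape_of):
        Ke = Ke_by_shape[shape]
        for a, ga in enumerate(dofs):
            for b, gb in enumerate(dofs):
                K[ga][gb] += Ke[a][b]
    return [sum(K[i][j] * d[j] for j in range(n)) for i in range(n)]


# Hypothetical 1D bar: 5 nodes, 4 identical two-node elements of unit stiffness.
elements = [(0, 1), (1, 2), (2, 3), (3, 4)]
shape_of = ["unit_bar"] * len(elements)
Ke_by_shape = {"unit_bar": [[1.0, -1.0], [-1.0, 1.0]]}
d = [0.0, 1.0, 4.0, 9.0, 16.0]

y_free = assembly_free_matvec(elements, shape_of, Ke_by_shape, d)
y_assembled = assembled_matvec(elements, shape_of, Ke_by_shape, d)
```

Both routes give the same vector; the assembly-free route stores a single 2 x 2 matrix here instead of a 5 x 5 one, and each element's contribution is independent of the others apart from the final scatter-add.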

Experiment

10^6 Elements

1 Distinct element

[Bar chart: SpMV; Kd (msec), for Assembled, AF-CPU, and AF-GPU; axis from 0 to 1000; labeled values 770 and 37]

- Same number of FLOPS!

- Reduced memory

[Figure labels: CPU; Naïve Kd; Mesh-aware Kd; One Kd (SpMV)]

Physics Aware Deflation

Model → Discretize → Assemble/Solve (Kd = f) → Post-process

Physics Aware Deflation

$\tilde{K}_0 = W^T K W$

$\tilde{K}_0 \ll K$

Agglomeration/Grouping

Treat each group as rigid body

Deflated CG: Kd = f
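A small sketch may help show how agglomeration leads to a much smaller coarse matrix. It rests on simplifying assumptions that are not from the talk: scalar 1D degrees of freedom, so that treating a group as a rigid body reduces to one translation column per group in W, and a hypothetical spring-chain operator. The coarse matrix W^T K W is formed using only products with K, one per column of W.

```python
# Sketch only: build W from an agglomeration and form the coarse matrix W^T K W.
# Scalar 1D DOFs are assumed, so each "rigid" group contributes one translation column.

def deflation_space(groups, n):
    """W[i][g] = 1 if DOF i belongs to group g, else 0."""
    W = [[0.0] * len(groups) for _ in range(n)]
    for g, members in enumerate(groups):
        for i in members:
            W[i][g] = 1.0
    return W


def coarse_matrix(matvec, W):
    """Return W^T K W using only matvec(v) = K v, one product per column of W."""
    n, m = len(W), len(W[0])
    K_coarse = [[0.0] * m for _ in range(m)]
    for g in range(m):
        KWg = matvec([W[i][g] for i in range(n)])
        for h in range(m):
            K_coarse[h][g] = sum(W[i][h] * KWg[i] for i in range(n))
    return K_coarse


def chain_matvec(v):
    """Hypothetical operator: tridiag(-1, 2, -1), unit springs in a chain fixed at both ends."""
    n = len(v)
    return [
        2.0 * v[i] - (v[i - 1] if i > 0 else 0.0) - (v[i + 1] if i < n - 1 else 0.0)
        for i in range(n)
    ]


groups = [[0, 1, 2], [3, 4, 5]]              # two agglomerated groups of three DOFs each
W = deflation_space(groups, 6)
K_coarse = coarse_matrix(chain_matvec, W)    # 2 x 2, versus 6 x 6 for K
```

In a deflated CG solver, a small system with this coarse matrix would be solved at each iteration to remove the components of the residual that lie in the span of W; that step is not shown here.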

Example


3.15 million DOF

Cloud Based Design Optimization

Fast Limited Memory FEA

1. Mesh Congruency

2. AF Deflation

Pareto Optimization

Design Optimization

K Matrix: Constantly changing

• Update K?

• Update deflation?

Assembly-free: $Kd \equiv \sum_{i=1}^{N} \left(K_e d_e\right)$

Skip deleted finite elements

SpMV accelerates further

$\tilde{K}_0 = W^T K W$

$\tilde{K} = \tilde{K}_0 - W^T (\Delta K_e) W$

$\tilde{K} \ll K$
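The coarse-matrix update can be checked numerically on a toy case. The sketch below assumes that deleting an element removes its stiffness contribution from K, with that contribution written in global numbering as `dK`; this reading, the 1D bar, and all helper names are assumptions for illustration. It compares the cheap correction of the coarse matrix against rebuilding it from the modified K.

```python
# Sketch only: when an element is deleted, correct the coarse matrix instead of rebuilding it.

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def subtract(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]


def project(W, K):
    """Return W^T K W."""
    return matmul(transpose(W), matmul(K, W))


# Hypothetical 1D bar: 4 nodes, 3 unit-stiffness elements, plus a unit ground spring at
# each end node (only there to keep K positive definite in this toy case).
n = 4
elements = [(0, 1), (1, 2), (2, 3)]
K = [[0.0] * n for _ in range(n)]
for a, b in elements:
    K[a][a] += 1.0
    K[b][b] += 1.0
    K[a][b] -= 1.0
    K[b][a] -= 1.0
K[0][0] += 1.0
K[3][3] += 1.0

W = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]   # two groups: DOFs {0, 1} and {2, 3}
K0_coarse = project(W, K)

# Delete element (1, 2), which straddles the two groups; dK is its contribution to K.
dK = [[0.0] * n for _ in range(n)]
dK[1][1] = dK[2][2] = 1.0
dK[1][2] = dK[2][1] = -1.0

updated = subtract(K0_coarse, project(W, dK))    # coarse correction only
rebuilt = project(W, subtract(K, dK))            # full recomputation, for comparison
```

The two results agree because the projection is linear in K. In this toy setup, deleting an element that lies entirely inside one group would leave the coarse matrix unchanged, since a rigid translation of that group does not strain the element.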

Example: Design Optimization

• OptiStruct (commercial)

• Xeon E5 2697, 92 GB

• 20 hours!

• Pareto

• I7 4770, 8 GB

• 42 mins

Framework

Fast Limited Memory FEA

GPU Acceleration

Pareto Optimization

Mesh Aware SpMV on GPU

Deflation on GPU

Restriction: $W^T d$

Prolongation: $W\mu$

Example: Design Optimization

• OptiStruct (commercial)

• Xeon E5 2697, 92 GB

• 20 hours!

• Pareto

• I7 4770, 8 GB

• 42 mins

• Pareto

• GTX 480, 1.5 GB

• 6 mins
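The ratios implied by these run times can be worked out directly. The short calculation below uses only the numbers quoted above; the dictionary keys are labels chosen here for readability. Since the three runs differ in both software and hardware, the ratios are rough indicators rather than controlled benchmarks.

```python
# Run times quoted on the slides, converted to minutes.
minutes = {
    "OptiStruct, Xeon E5 2697": 20 * 60,
    "Pareto, I7 4770": 42,
    "Pareto, GTX 480": 6,
}

baseline = minutes["OptiStruct, Xeon E5 2697"]
speedup_vs_optistruct = {name: baseline / t for name, t in minutes.items()}
gpu_vs_cpu = minutes["Pareto, I7 4770"] / minutes["Pareto, GTX 480"]
```

This gives roughly 29x for Pareto on the I7 4770, 200x for Pareto on the GTX 480, and 7x between the two Pareto runs.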

Cloud Based Design Optimization

Fast Limited Memory FEA

Pareto Optimization

GPU Acceleration

WebGL

WebGL & Three.js

WebGL

• JavaScript API for 3D graphics in browsers

• www.khronos.org

• Almost all browsers

Three.js

• Higher-level library

• www.threejs.org

• Almost all browsers

Finally …


A Pilot Service

www.cloudtopopt.com

• Entry-level server

  - E3-1270 V3

  - 8 GB

• Limited to 150,000 degrees of freedom

• 300+ users


Plans

• Port to HPC provider

• NSF funding

• Launch startup

www.cloudtopopt.com

Acknowledgements

Praveen Yadav Shiguang Deng Amir M. Mirzendehdel Chaman Singh Alireza Taheri Bian Xiang

Anirudh Krishnakumar Anirban Niyogi Victor Cavalcanti Cameron Gilanshah Yibo Hu Alex Buehler

Funding

• NSF

• Air Force

• Luvata

• Autodesk

• Sandia National Lab

ksuresh@wisc.edu www.cloudtopopt.com