ANSYS Fluent 16.0 Performance Benchmarking
Note
• The following research was performed under the HPC Advisory Council
activities
– Special thanks to: HP Enterprise, Mellanox
• For more information on the supporting vendors' solutions, please refer to:
– www.mellanox.com, http://www.hp.com/go/hpc
• For more information on the application:
– http://www.ansys.com
ANSYS Fluent
• Computational Fluid Dynamics (CFD) is a computational technology
– Enables the study of the dynamics of things that flow
– Enables a better understanding of the qualitative and quantitative physical phenomena in the flow, which is used to
improve engineering design
• CFD brings together a number of different disciplines
– Fluid dynamics, mathematical theory of partial differential systems, computational geometry, numerical
analysis, computer science
• ANSYS FLUENT is a leading CFD application from ANSYS
– Widely used in almost every industry sector and manufactured product
Objectives
• The presented research was conducted to provide best practices for
– Fluent performance benchmarking
• CPU performance comparison
• MPI library performance comparison
• Interconnect performance comparison
• System generations comparison
• The presented results will demonstrate
– The scalability of the compute environment/application
– Considerations for higher productivity and efficiency
Test Cluster Configuration
• HP Proliant XL170r Gen9 32-node (1024-core) cluster
– Mellanox ConnectX-4 100Gbps EDR InfiniBand Adapters
– Mellanox Switch-IB SB7700 36-port 100Gb/s EDR InfiniBand Switch
• HP Proliant XL230a Gen9 32-node (1024-core) cluster
– Mellanox Connect-IB FDR 56Gbps FDR InfiniBand Adapters
– Mellanox SwitchX-2 SX6036 36-port 56Gb/s FDR InfiniBand / VPI Ethernet Switch
• Dual-Socket 16-Core Intel E5-2698v3 @ 2.30 GHz CPUs (BIOS: Maximum Performance, Turbo Off)
• Memory: 128GB memory, DDR4 2133 MHz
• OS: RHEL 6.5, MLNX_OFED_LINUX-3.0-1.0.1 InfiniBand SW stack
• MPI: Platform MPI 9.1
• Application: ANSYS Fluent 16.0
• Benchmark datasets: ANSYS Fluent Standard Benchmarks
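As a sketch, a run on this cluster could be launched with a Fluent command line along the following lines; the hostfile and journal names are placeholders, and the exact flags (`-mpi=pcmpi` to select Platform MPI, `-pib` for the InfiniBand interconnect) should be checked against the Fluent 16.0 launcher documentation for the installed version.

```shell
# Illustrative only: 32 nodes x 32 cores = 1024 MPI processes,
# double-precision 3D solver, batch mode driven by a journal file.
fluent 3ddp -t1024 -cnf=hosts.txt -mpi=pcmpi -pib -g -i run_benchmark.jou
```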
HP ProLiant XL230a Gen9 Server
– Processor: Two Intel® Xeon® E5-2600 v3 Series, 6/8/10/12/14/16 cores
– Chipset: Intel Xeon E5-2600 v3 series
– Memory: 512 GB (16 x 32 GB); 16 DIMM slots; DDR4 R-DIMM/LR-DIMM; up to 2,133 MHz
– Max Memory: 512 GB
– Internal Storage: HP Dynamic Smart Array B140i SATA controller; HP H240 Host Bus Adapter
– Networking: Network module supporting various FlexibleLOMs: 1GbE, 10GbE, and/or InfiniBand
– Expansion Slots: 1 internal PCIe: 1 PCIe x16 Gen3, half-height
– Ports: Front: (1) Management, (2) 1GbE, (1) Serial, (1) S.U.V. port, (2) PCIe; internal Micro SD card & Active Health
– Power Supplies: HP 2,400 W or 2,650 W Platinum hot-plug power supplies, delivered by HP Apollo 6000 Power Shelf
– Integrated Management: HP iLO (Firmware: HP iLO 4); Option: HP Advanced Power Manager
– Additional Features: Shared power & cooling; up to 8 nodes per 4U chassis; single GPU support; Fusion I/O support
– Form Factor: 10 servers in 5U chassis
Fluent Performance - EDR InfiniBand vs FDR InfiniBand
• InfiniBand delivers superior scalability performance
– EDR InfiniBand provides higher performance and better scalability than other network interconnects
– EDR InfiniBand delivers up to 44% higher performance at 32 nodes / 1024 MPI processes
– InfiniBand continues to scale as node and process counts increase
[Chart: 32 MPI Processes / Node; higher is better; EDR gains: 44%, 25%]
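Fluent's standard benchmark metric is a "rating": the number of benchmark jobs a system could complete in a day, i.e. 86,400 divided by the solver wall time in seconds (higher is better), and the percentage gains quoted on these slides are ratios of such ratings. A small sketch of that arithmetic, with made-up wall times chosen only to reproduce a 44% gap, not the measured values from this study:

```python
def rating(wall_time_s):
    """Fluent benchmark rating: jobs per day at the given wall time."""
    return 86400.0 / wall_time_s

def percent_gain(rating_a, rating_b):
    """Relative advantage of configuration A over B, in percent."""
    return (rating_a / rating_b - 1.0) * 100.0

# hypothetical placeholder timings, NOT measurements from this study
edr = rating(60.0)    # EDR run: 60.0 s per job -> rating 1440
fdr = rating(86.4)    # FDR run: 86.4 s per job -> rating 1000
print(round(percent_gain(edr, fdr)))  # 44
```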
Fluent Performance - EDR InfiniBand vs FDR InfiniBand
• Performance advantage of EDR InfiniBand demonstrated for all input datasets tested
[Chart: 32 MPI Processes / Node; higher is better]
Fluent Performance - EDR InfiniBand vs FDR InfiniBand
• Performance advantage of EDR InfiniBand demonstrated for all input datasets tested
– EDR IB improves over FDR IB by ~20% at 32 nodes (1024 cores) and by ~14% at 16 nodes, on average
[Chart: 32 MPI Processes / Node; higher is better; gains: 20%, 14%]
Fluent Performance – Scalability
• EDR InfiniBand delivers higher scalability performance
• Scaling efficiency reached 100%+ (superlinear scaling) for the benchmark cases tested
– Tested up to 32 nodes / 1024 MPI processes
[Chart: 32 MPI Processes / Node; higher is better]
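The "100%+" scaling figure is a scaling efficiency: measured speedup (ratio of benchmark ratings) divided by the ideal linear speedup implied by the node-count ratio. A sketch of that calculation with hypothetical placeholder ratings, not measurements from this study:

```python
def efficiency(rating_n, rating_base, nodes_n, nodes_base):
    """Scaling efficiency in percent: measured speedup over ideal speedup."""
    speedup = rating_n / rating_base   # measured speedup from ratings
    ideal = nodes_n / nodes_base       # ideal linear speedup
    return speedup / ideal * 100.0

# superlinear example: 32 nodes running more than 32x faster than 1 node
# (e.g. because the per-node working set starts fitting in cache)
eff = efficiency(rating_n=33.5, rating_base=1.0, nodes_n=32, nodes_base=1)
print(round(eff, 1))  # 104.7
```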
Fluent Summary
• EDR InfiniBand delivers superior scalability performance
– EDR IB provides higher performance and better scalability than other network interconnects
– EDR IB delivers up to 44% higher performance at 32 nodes / 1024 MPI processes
• Performance advantage of EDR IB is demonstrated for all input datasets tested
– EDR IB outperforms FDR IB
• by ~20% at 32 nodes (1024 cores) and by ~14% at 16 nodes, on average
• Scaling efficiency reached 100%+ for the benchmark cases tested
Thank You
HPC Advisory Council
All trademarks are property of their respective owners. All information is provided "As-Is" without any kind of warranty. The HPC Advisory Council makes no representation as to the accuracy and
completeness of the information contained herein. The HPC Advisory Council undertakes no duty and assumes no obligation to update or correct any information presented herein.