Marco_Netto
Over the last few weeks we have been working together with Altair engineers to verify and validate their nanoFluidX v2024 product on Azure. This software offers clear advantages for engineers tackling problems where traditional CFD technology demands significant manual effort and heavy computational resources. Vehicle wading is an important durability attribute where engineers monitor water reach and accumulation, and assess the potential for damage caused by water impact.
nanoFluidX's Lagrangian meshless approach was designed from inception for GPU compute using NVIDIA CUDA, making it one of the fastest SPH solvers on the market. Models can be set up incredibly quickly, giving engineers the power to iterate faster.
With this validation, the intention was to look at the GPU compute possibilities in two ways: how nanoFluidX performs on the NVIDIA H100 series GPUs, and how it scales up to 8-way GPU virtual machines (VMs). Let's look at the A100 and the H100 first.
The NC_A100_v4 comes in three flavors, with 1, 2, or 4 A100 80GB GPUs. At heart these are PCIe-based GPUs, but internally they are NVLink-connected in pairs. Per GPU, the rest of the system consists of 24 (non-multithreaded) AMD Milan CPU cores, 220GB of main memory, and a 960GB NVMe local scratch disk. When selecting the 2- or 4-GPU VM, these resources scale up accordingly, to a total of 880GB of main memory on the largest flavor.
The NC_H100_v5 has grown along with the GPU's capabilities. It is available in a 1- or 2-GPU configuration built around the NVIDIA 94GB H100 NVL. While this GPU presents a PCIe interface to the host system, many of its capabilities are in line with the SXM H100 series. Per GPU, the CPU count increases to 40 (non-multithreaded) AMD Genoa cores and 320GB of main memory, together with an upgraded 3.5TB NVMe local scratch disk.
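To make the NC_A100_v4 sizing concrete, the per-GPU resources above can be tabulated and scaled by GPU count. This is an illustrative sketch based only on the figures quoted in this post; the function name and structure are our own, not an Azure API.

```python
# Illustrative summary of the NC_A100_v4 resources described above.
# Per-GPU figures scale linearly with the number of GPUs in the VM flavor.
NC_A100_V4_PER_GPU = {
    "cpu_cores": 24,    # non-multithreaded AMD Milan cores
    "memory_gb": 220,   # main memory
    "scratch_gb": 960,  # NVMe local scratch disk
}

def nc_a100_v4_totals(gpu_count: int) -> dict:
    """Return aggregate resources for a 1-, 2-, or 4-GPU NC_A100_v4 VM."""
    if gpu_count not in (1, 2, 4):
        raise ValueError("NC_A100_v4 is offered with 1, 2, or 4 GPUs")
    return {name: value * gpu_count for name, value in NC_A100_V4_PER_GPU.items()}

# The 4-GPU flavor tops out at 880GB of main memory, as noted above.
print(nc_a100_v4_totals(4))
```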
The benchmark that was run for this validation is the Altair CX-1 car model. This benchmark represents a production-scale model of a full-size vehicle traveling at 10 km/h through a 24-meter wading channel over 15 seconds.
"Collaborating with Microsoft and NVIDIA, we have successfully validated nanoFluidX v2024 on NVIDIA’s A100 and H100 GPUs. The latest release boasts a solver that is 1.5x faster than previously, and offers improved scaling on multiple GPUs. These benchmarks show the use of NVIDIA H100 significantly enhances performance by up to 1.8x, cutting simulation times and accelerating design cycles. These advancements solidify nanoFluidX as one of the fastest Smoothed-particle Hydrodynamic (SPH) GPU code on the market." - David Curry, Senior Vice President, CFD and EDEM, Altair.
As can be seen in the table below, the H100 delivers higher performance than the A100, in line with NVIDIA's published performance increase between the two generations. Both the software and the Azure VMs therefore allow these GPUs to reach their compute potential.
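Combining the figures from the quote above gives a rough sense of the overall gain: a 1.5x faster solver in v2024, and up to 1.8x from moving to the H100. The snippet below treats the two factors as independent and multiplicative, which is a simplification; actual gains depend on the model and GPU configuration.

```python
# Back-of-the-envelope combination of the speedups quoted above.
# Assumes the solver and hardware gains are independent and multiplicative,
# which is a simplification, not a measured result.
solver_speedup = 1.5  # nanoFluidX v2024 vs. the previous release
h100_speedup = 1.8    # H100 vs. A100, "up to", per these benchmarks

combined = solver_speedup * h100_speedup
print(f"Combined speedup vs. previous release on A100: up to {combined:.2f}x")
```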
Since nanoFluidX supports multi-GPU systems, we wanted to validate its scalability and test it on the 8-way GPU ND series. Again, we tested both the NVIDIA A100 system, the NDads_A100_v4, and its H100-based successor, the NDisr_H100_v5. Both of these systems have all 8 GPUs interconnected through NVLink.
As shown in the table above, nanoFluidX effectively utilizes the power of all eight GPUs. On the NDisr_H100_v5 it completed the simulation in one hour, significantly reducing turnaround and design cycle time.
While you can simply go to the Azure portal, request quota, and spin up these VMs, we often see customers seeking an HPC environment that integrates better with their production workflow. Altair offers a way to run projects on Azure through its Altair One platform; work with your Altair representative to enable this Azure-based solution for you. Alternatively, you can use Altair's SaaS solution, Altair Unlimited, available as a virtual appliance from the marketplace, to deploy and manage your own HPC cluster on Azure. To enable GPU quotas for HPC, please coordinate with your Azure account manager.
#AzureHPCAI