NGads V620 series: Optimized for gaming scenarios - General availability and benchmarks

  • Thread starter: MSFTisgonzalez

The Azure family of visualization GPU VMs has gained a new member. Many customers have used the VMs in our existing portfolio for a variety of visualization workloads as well as gaming scenarios. We have consistently received two pieces of feedback. First, workloads with intense graphics generation and streaming require a different type of VM configuration than what has been available. Second, as time has passed, customers want to take advantage of newer GPU architectures available from our GPU technology partners.



Optimized for gaming scenarios

With these clear guidelines in mind, we are excited to announce the general availability of our new NGads V620 series virtual machines (VMs). This VM series has GPU, CPU, and memory resources balanced to generate and stream high-quality graphics for a performant, interactive gaming experience hosted on Microsoft Azure. The new NGads instances give online gaming providers the power and stability that they need, at an affordable price.

The VMs also feature AMD Software: Cloud Edition, which targets the same optimizations available in the consumer gaming version of the Adrenalin driver but is further tested and optimized for the cloud environment. AMD Software: Cloud Edition will receive regular updates to support the latest titles introduced by gaming partners.



… And highly capable for VDI and rendering workloads

The NGads V620 series with AMD Software: Cloud Edition includes support for accelerated virtual desktop environments, with Radeon PRO optimizations to support high-end workstation applications.

The NGads V620 series GPU-enabled virtual machines are powered by AMD Radeon PRO V620 GPUs and AMD EPYC 7763 CPUs. The AMD Radeon PRO V620 GPUs have a maximum frame buffer of 32 GB, which can be divided up to 4 ways through hardware partitioning, or shared by giving multiple users access to session-based operating systems such as Windows Server 2022 or Windows 11 EMS. The AMD EPYC CPUs have a base clock speed of 2.45 GHz and a boost speed of 3.5 GHz. VMs are assigned full cores instead of threads, enabling full access to AMD’s powerful “Zen 3” cores. The NGads V620 series also features NVMe drives standard with each VM size, with up to 1024 GiB of temp storage for very fast local data access.

NGads instances come in four sizes, allowing customers to right-size their gaming environments for the performance and cost that best fits their business needs. The two smallest instances rely on industry-standard SR-IOV technology to partition the GPUs into ¼ and ½ instances, enabling customers to run workloads with no interference or security concerns between users sharing the same physical graphics card.




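| Instance config           | vCPU (physical cores) | GPU memory (GiB) | GPU partition size | Memory (GiB) | Temp storage (SSD, GiB) | Azure network (Gbps) |
|---------------------------|-----------------------|------------------|--------------------|--------------|-------------------------|----------------------|
| Standard_NG8ads_V620_v1   | 8                     | 8                | ¼ GPU              | 16           | 256                     | 10                   |
| Standard_NG16ads_V620_v1  | 16                    | 16               | ½ GPU              | 32           | 512                     | 20                   |
| Standard_NG32ads_V620_v1  | 32                    | 32               | 1x GPU             | 64           | 1024                    | 40                   |
| Standard_NG32adms_V620_v1 | 32                    | 32               | 1x GPU             | 176          | 1024                    | 40                   |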
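
For readers who want to right-size programmatically, the instance configurations above can be captured as data. The following is a minimal Python sketch, not an official tool: the `NGadsSize` class and the `smallest_fit` helper are hypothetical names introduced here for illustration, and the selection rule (first size that meets a minimum GPU memory and system memory) is an assumption about what "right-sizing" means for a given workload.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NGadsSize:
    """One row of the instance configuration table (hypothetical helper type)."""
    name: str
    vcpus: int
    gpu_memory_gib: int
    gpu_fraction: float
    memory_gib: int
    temp_storage_gib: int
    network_gbps: int


# Values copied from the instance configuration table, ordered smallest first.
SIZES = [
    NGadsSize("Standard_NG8ads_V620_v1", 8, 8, 0.25, 16, 256, 10),
    NGadsSize("Standard_NG16ads_V620_v1", 16, 16, 0.5, 32, 512, 20),
    NGadsSize("Standard_NG32ads_V620_v1", 32, 32, 1.0, 64, 1024, 40),
    NGadsSize("Standard_NG32adms_V620_v1", 32, 32, 1.0, 176, 1024, 40),
]


def smallest_fit(min_gpu_memory_gib: int, min_memory_gib: int) -> Optional[NGadsSize]:
    """Return the first (smallest) size meeting both minimums, or None if none does."""
    for size in SIZES:
        if (size.gpu_memory_gib >= min_gpu_memory_gib
                and size.memory_gib >= min_memory_gib):
            return size
    return None


if __name__ == "__main__":
    choice = smallest_fit(min_gpu_memory_gib=12, min_memory_gib=24)
    print(choice.name if choice else "no matching size")
```

A real selection would also weigh price and the benchmark results below; this sketch only filters on the two memory columns.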



Initial benchmarks across gaming scenarios, graphics applications and rendering

The purpose of this blog is to demonstrate initial NGads V620 performance and benchmarking results to help you pick the right VM for your workload.



Raw benchmarks

The first set of results is for the UL 3DMark Time Spy test suite, which measures graphics rendering and ray-marched performance. The second set is for UL 3DMark Port Royal, which focuses on GPU performance for ray tracing graphics.

The higher the aggregate score, the better the system performance. The test suite version is 2.23.7457.





[Figure: NGads V620 TimeSpy benchmark]



Key points:

  • The Time Spy score for the NG32ads instance tops out at 16576. The NG16ads and NG8ads instances also perform very well, in general scoring better than a linear fraction of the next-largest VM size's result.
  • The Port Royal score shows a similar pattern and very good results that take advantage of the ray tracing capability of the GPU.

[Figure: NGads V620 Port Royal benchmark]





In-Game Benchmarks

Gears 5 is known as one of the games with the most challenging graphics all around. We chose the 1080p Ultra settings, using Parsec as the remoting protocol. The metric is FPS; higher is better.

[Figure: NGads V620 Gears 5 Ultra benchmark]



Key points:

  • The average frame rate generated by the NG8ads instance, with its 8 GB frame buffer, is 40.5 fps, rising to 140.4 fps for the NG32ads instance.
  • Notice that the “high memory” (NG32adms) instance with additional RAM does not exhibit an increase in results, which is expected for this workload in which the bulk of the graphics workload is being processed within the GPU and not in RAM.
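
To make the scaling in these numbers concrete, the short Python sketch below compares the NG8ads frame rate with what a strictly linear split of the full-GPU result would give. The function name `scaling_efficiency` is introduced here for illustration, and the comparison assumes that a ¼ GPU partition would ideally deliver one quarter of the NG32ads frame rate.

```python
def scaling_efficiency(partition_fps: float, full_fps: float, gpu_fraction: float) -> float:
    """Ratio of measured fps to the fps a strictly linear GPU split would give."""
    linear_fps = full_fps * gpu_fraction
    return partition_fps / linear_fps


# Average Gears 5 frame rates reported above.
NG8ADS_FPS = 40.5    # 1/4 GPU partition
NG32ADS_FPS = 140.4  # full GPU

linear_baseline = NG32ADS_FPS * 0.25
efficiency = scaling_efficiency(NG8ADS_FPS, NG32ADS_FPS, 0.25)

print(f"Linear baseline for 1/4 GPU: {linear_baseline:.1f} fps")
print(f"Measured / linear: {efficiency:.2f}")
```

A ratio above 1.0 means the partition delivers more than its proportional share of the full-GPU frame rate, which is the case for the NG8ads result here (40.5 fps against a linear baseline of 35.1 fps).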



Graphics applications

This set of results is relevant for users of CAD, AEC, and other graphics-heavy applications. SpecViewPerf is a global standard benchmark that tests the 3D graphics performance of systems running under OpenGL and DirectX by running a number of “viewsets”, each corresponding to a different workstation-level application that represents actual workloads for a variety of industries.

SpecViewPerf scores are the frame rate at which the GPU renders the scenes of a particular viewset. Higher is better for these results.

Windows 11 Pro, 22H2

VDI Protocol: RDP, 1080p

Azure Region location: West Europe



[Figure: NGads V620 SpecViewPerf 2020 Viewset Scores]



Key points:

  • The results for each viewset show that the NG series has increased performance across all application types. From here you can choose which viewset matters the most for your users.
  • The amount of GPU memory allocated to each VM makes a large difference for these kinds of workloads. The graph below consolidates the measurements into an overall geometric mean (geomean) for each VM size across viewsets. Scores improve as the amount of GPU memory allocated to the VM increases, which is consistent for these graphics-heavy workloads.
  • As in previous graphs, there is very little difference between the NG32ads VM and the NG32adms VM, which also has 32 GB of GPU memory but more system RAM (176 GB), indicating that for a workload of this size the graphics are being processed in the GPU and not in system RAM. However, we would expect larger models to require more RAM for a smooth user experience. Hence it is important to take the results as guidance and run a proof of concept in your environment.

[Figure: NGads V620 SpecViewPerf 2020 benchmark geomean]
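
For readers unfamiliar with the geomean used in the consolidated graph, the Python sketch below shows how per-viewset scores reduce to a single number. The viewset names and scores in it are hypothetical placeholders chosen for illustration, not measurements from this post, and the helper name `viewset_geomean` is likewise introduced here.

```python
from statistics import geometric_mean, mean

# Hypothetical per-viewset frame rates for one VM size.
# These are placeholders for illustration, NOT results from this post.
example_scores = {
    "viewset_a": 50.0,
    "viewset_b": 200.0,
    "viewset_c": 100.0,
}


def viewset_geomean(scores: dict) -> float:
    """Geometric mean of the per-viewset scores: the n-th root of their product."""
    return geometric_mean(list(scores.values()))


print(f"Geomean:         {viewset_geomean(example_scores):.1f}")
print(f"Arithmetic mean: {mean(example_scores.values()):.1f}")
```

The geometric mean is commonly preferred for combining benchmark scores because a single very high viewset score pulls it up less than it would pull up an arithmetic mean.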



Rendering

The balanced resources and relatively large GPU memory available with the NG series make these VMs suitable for the rendering workloads that designers typically run on rendering workstations.

Below are the Blender benchmark results for the Classroom, Junkshop and Monster Under the Bed models. The test resolution was set to 1920 x 1080 at 25 fps. The metric is the number of samples per minute. Higher is better.



[Figure: NGads V620 Blender benchmark]



Under the hood: The Test Environment

The NGads V620 employs GPU Partitioning (GPU-P), a technology that allows multiple users to share a single GPU node at once. GPU-P is based on single-root I/O virtualization (SR-IOV) technology, which allows I/O devices to be shared and allows a single root function to appear as multiple separate physical devices. It makes use of virtual functions, which map the required hardware resources to each child partition. When the child partition accesses the device, the virtual device driver can often reach the hardware directly, without having to communicate with the host.

The benchmarks featured in this blog were made possible thanks to the open-source benchmarking solution, Virtual Client, developed by Microsoft. Virtual Client enables us to efficiently benchmark our systems at scale, using the latest software and firmware. This helps us closely replicate real-world user conditions while reducing result variances.



For more information:

https://github.com/microsoft/VirtualClient/

https://microsoft.github.io/VirtualClient/


