
MATLAB and Azure: A Match Made in Performance Heaven



Posted by Kent Altena

Used in many industries, including engineering, mathematics, and finance, MATLAB is a proprietary programming language and multi-paradigm numerical computing environment. As data analysis, simulation, and modeling tasks grow more complex, MATLAB's performance largely determines how quickly these operations complete. Microsoft Azure offers a cloud platform with virtual machines (VMs) that can run MATLAB. However, selecting the right VM SKU can be difficult, and choosing the wrong one can lead to suboptimal performance and higher costs. In this blog article, we'll discuss how processor selection and other factors affect MATLAB's performance and how to choose the right Azure VM SKU for your MATLAB workloads. We'll also explore some best practices for optimizing MATLAB performance on Azure VMs.

 

For background, MATLAB, short for Matrix Laboratory, is a numerical computing environment developed by MathWorks. It provides a wide range of tools for calculation, data analysis, visualization, and simulation, along with a high-level language for expressing complex mathematical computations easily and efficiently. With a vast library of built-in functions and toolboxes, MATLAB is a platform for solving complex engineering, scientific, and financial problems. Its user-friendly interface, combined with its powerful computing capabilities, has made it a popular choice for researchers, engineers, and scientists across various industries.

 

 

 

The default approach for many organizations is to run MATLAB calculations or simulations directly on end-user workstations. However, this is often suboptimal because it leads to over-provisioning the desktop environment, especially when using Terminal Services or Virtual Desktop Infrastructure (VDI). Last year I worked with a very large nonprofit non-governmental organization (NGO) that had such an environment. Their VDI was a difficult-to-manage Remote Desktop Services (RDS) deployment in which users shared access to large compute nodes sized to run their data scientists' jobs.

 

 

 

Offloading MATLAB Workloads to Dedicated Compute Nodes

 

 

 

Using MATLAB Parallel Server and its native HPC Pack integration lets you right-size the front end to run only the desktop client and size the back end to handle the simulations. Offloading MATLAB calculations to HPC Pack can significantly improve performance and scalability. HPC Pack provides a platform for running parallel and distributed MATLAB applications across a cluster of machines. It also offers job scheduling, data management, and native cloud orchestration ("AutoGrowShrink") that minimizes compute costs when no jobs are in the queue. With HPC Pack, users can take advantage of the full power of their cluster, enabling faster and more efficient data processing and analysis. The NGO mentioned above fixed its front end by implementing Azure Virtual Desktop (AVD) and built its compute infrastructure on HC44rs SKUs.

 

 

 

For the compute nodes running the calculations, there are several performance recommendations. MATLAB is primarily a compute-intensive program that needs enough memory to hold the models it works on. As a multi-threaded application, MATLAB benefits from having several physical cores available. In general, hyperthreading does not benefit the calculations once enough physical cores are present. To choose an optimal memory-to-core ratio, it is important to know the size of your company's models, because any paging activity will seriously degrade performance. Local disk performance can also affect simulation performance, since MATLAB writes results back out to disk. The general Azure recommendation is to use the local ephemeral disk for this transient data and to ensure that the Server Message Block (SMB) share location is performant.
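As a rough illustration of the memory-to-core reasoning above, the following Python sketch computes RAM per active core for full-size and constrained-core variants, using the RAM and core counts from the VM table in this article. The helper names and the 10 GB per-worker model size are hypothetical, and the check assumes one MATLAB worker per core and ignores operating system overhead.

```python
# RAM per active core for full-size and constrained-core VM variants.
# RAM and core counts come from the VM comparison table in this article;
# the function names and the 10 GB model size are illustrative assumptions.


def ram_per_core_gb(ram_gb: float, active_cores: int) -> float:
    """Return GB of RAM available per active physical core."""
    if active_cores <= 0:
        raise ValueError("active_cores must be positive")
    return ram_gb / active_cores


def fits_in_memory(model_gb_per_worker: float, ram_gb: float, active_cores: int) -> bool:
    """True if each core's share of RAM can hold one worker's model.

    Assumes one worker per core and no OS overhead (a simplification).
    """
    return model_gb_per_worker <= ram_per_core_gb(ram_gb, active_cores)


# HC44rs: 352 GB RAM, 44 cores, constrained-core options of 32 and 16.
hc44rs = {cores: ram_per_core_gb(352, cores) for cores in (44, 32, 16)}

# HB120rs_v3: 448 GB RAM, 120 cores, constrained-core options of 96, 64, 32, 16.
hb120rs_v3 = {
    cores: round(ram_per_core_gb(448, cores), 2) for cores in (120, 96, 64, 32, 16)
}

print("HC44rs GB per core:", hc44rs)
print("HB120rs_v3 GB per core:", hb120rs_v3)

# Hypothetical 10 GB per-worker model on HC44rs, full size vs. 32-core variant.
print("Fits on 44 cores:", fits_in_memory(10, 352, 44))
print("Fits on 32 cores:", fits_in_memory(10, 352, 32))
```

Under these assumptions, a constrained-core size trades cores for a larger memory share per core, which may be the cheaper way to avoid paging when models are large.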

 

 

 

Benchmarking MATLAB Workloads

 

 

 

MATLAB provides a built-in benchmarking utility called bench, which measures the execution time of specific MATLAB functions and compares them against standard reference values. The bench function exercises several different types of computation to provide a broad performance profile. Benchmarking helps identify performance bottlenecks and guides optimization efforts, such as parallelizing computations or optimizing code. Use the MATLAB function timeit to produce reliable and repeatable timings of your own code, and gputimeit to benchmark GPU code. Using bench, you can compare candidate Azure VM SKUs against one another.

 

 

 

From a methodology standpoint, I ran the same Windows Server 2019 OS with the latest patches and the same MATLAB version across all likely HPC VM SKUs. For the General Purpose VM families, I disabled hyperthreading using metatags. I ran the bench command three times on each VM family and averaged the results. If one result was dramatically out of range compared with the other two, I threw it out and ran the benchmark one additional time. In each case, I ran the MATLAB bench command from the local ephemeral drive.
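The averaging rule described above can be sketched in Python as follows. This is only an illustration of the idea, not the procedure actually scripted for these tests: the 25% tolerance, the function names, and the sample timings are all assumptions chosen for the example.

```python
from statistics import mean, median


def find_outlier(times, tolerance=0.25):
    """Return the index of the run furthest from the median if it deviates by
    more than `tolerance` (as a fraction of the median), otherwise None.

    The 25% default is an assumed threshold, not a value from the article.
    """
    mid = median(times)
    deviations = [abs(t - mid) / mid for t in times]
    worst = max(range(len(times)), key=lambda i: deviations[i])
    return worst if deviations[worst] > tolerance else None


def averaged_result(times, rerun, tolerance=0.25):
    """Average the runs; if one is an outlier, replace it with one rerun first.

    `rerun` is a callable that executes the benchmark once more and returns
    the new timing in seconds.
    """
    times = list(times)
    bad = find_outlier(times, tolerance)
    if bad is not None:
        times[bad] = rerun()
    return mean(times)


# Hypothetical timings (seconds): the third run is far out of range,
# so it is discarded and replaced by a single rerun.
result = averaged_result([0.21, 0.22, 0.40], rerun=lambda: 0.23)
print(f"Averaged result: {result:.4f} s")
```

Comparing against the median rather than the mean keeps a single bad run from dragging the reference point toward itself.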

 

 

 

Azure VMs Benchmarked:

VM Name | HC44rs | HB120rs_v3 | HB120rs_v2 | D64ds_v5 | D64ads_v5
Number of pCPUs | 44 (constrained-core options: 16, 32) | 120 (constrained-core options: 16, 32, 64, 96) | 120 (constrained-core options: 16, 32, 64, 96) | 32 | 32
Processor | Intel Xeon Platinum 8168 | AMD EPYC 7V73X (Milan-X) | AMD EPYC 7742 | Intel Xeon Platinum 8370C (Ice Lake) | AMD EPYC 7763v
Peak CPU Frequency | 3.7 GHz | 3.5 GHz | 3.4 GHz | 3.5 GHz | 3.5 GHz
RAM per VM | 352 GB | 448 GB | 456 GB | 256 GB | 256 GB
RAM per core | 8 GB (22, 11 GB) | 3.75 GB (28, 14, 7, 4.6 GB) | 3.8 GB (28, 14, 7, 4.6 GB) | 8 GB | 8 GB
Memory B/W per core | 4.3 GB/s | 5.25 GB/s | 2.9 GB/s | 4.26 GB/s | 4.26 GB/s
L3 Cache per VM | 33 MB | 768 MB | 256 MB | 48 MB | 256 MB
Attached Disk | 1 x 700 GB NVMe | 2 x 0.9 TB NVMe | 1 x 0.9 TB NVMe | 2400 GB SSD | 2400 GB SSD
Disk per Core | 15.9 GB (43.8, 21.8 GB) | 15 GB (113, 56, 28, 19 GB) | 7.5 GB (56, 28, 14, 9 GB) | 75 GB | 75 GB
Accelerated Networking | Yes | Yes | Yes | Yes | Yes

Values in parentheses are for the constrained-core options, listed from fewest to most active cores.

 

 

 

MATLAB Benchmark Results (bench execution times in seconds; lower is better):

VM SKU       LU      FFT     ODE     Sparse
HC44rs       0.2121  0.6646  0.2604  0.5576
HB120rs_v3   0.2236  0.401   0.2082  1.3275
HB120rs_v2   0.2309  0.3290  0.2482  1.5880
D64ds_v5     0.1697  0.23    0.1879  0.4406
D64ads_v5    0.2106  0.2809  0.1948  1.1102

 

 

 

The columns are explained below, based on the MATLAB bench documentation page:

 


  1. LU (Lower-Upper Decomposition) Benchmark: The LU benchmark tests the performance of MATLAB for the lower-upper decomposition of large matrices. This benchmark involves factoring a matrix into lower and upper triangular matrices using different algorithms. Performance Factors: Floating-point, regular memory access

  2. FFT (Fast Fourier Transform) Benchmark: The FFT benchmark tests the performance of MATLAB for computing the fast Fourier transform of large data sets. This benchmark involves transforming a time-domain signal into its frequency-domain representation. The results of the FFT benchmark are influenced by the size of the input data set and the complexity of the signal being transformed. Performance Factors: Floating-point, irregular memory access

  3. ODE (Ordinary Differential Equation) Benchmark: The ODE benchmark tests the performance of MATLAB for solving systems of ordinary differential equations. This benchmark involves simulating the behavior of a physical system over time using differential equations. The results of the ODE benchmark are influenced by the complexity of the system being modeled and the accuracy of the numerical methods used to solve the equations. Performance Factors: Data structures and MATLAB function files, disk performance

  4. Sparse Benchmark: The Sparse benchmark tests the performance of MATLAB for manipulating sparse matrices. This benchmark involves performing operations on matrices that have a large number of zero elements. The results of the Sparse benchmark are influenced by the size and sparsity of the input matrix, as well as the specific operation being performed. Performance Factors: Mixed integer and floating-point
     

 

Performance Comparison:

 

Using HC44rs as the performance baseline (1.00), a result of 1.50 means 150% of the performance of the HC44rs result.

 

[Chart: benchmark performance of each VM SKU relative to the HC44rs baseline]
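One way to reproduce relative scores of this kind from the benchmark results table is sketched below in Python. It assumes that relative performance is the baseline time divided by the SKU's time, since bench reports execution times and a shorter time means better performance. That formula and the function name are assumptions, so the output may not match the chart exactly.

```python
# bench execution times copied from the results table in this article.
BENCH_TIMES = {
    "HC44rs":     {"LU": 0.2121, "FFT": 0.6646, "ODE": 0.2604, "Sparse": 0.5576},
    "HB120rs_v3": {"LU": 0.2236, "FFT": 0.401,  "ODE": 0.2082, "Sparse": 1.3275},
    "HB120rs_v2": {"LU": 0.2309, "FFT": 0.3290, "ODE": 0.2482, "Sparse": 1.5880},
    "D64ds_v5":   {"LU": 0.1697, "FFT": 0.23,   "ODE": 0.1879, "Sparse": 0.4406},
    "D64ads_v5":  {"LU": 0.2106, "FFT": 0.2809, "ODE": 0.1948, "Sparse": 1.1102},
}


def relative_performance(times, baseline="HC44rs"):
    """Return baseline_time / sku_time per test, rounded to two decimals.

    Values above 1.00 mean the SKU finished faster than the baseline.
    The ratio definition is an assumption about how the chart was built.
    """
    base = times[baseline]
    return {
        sku: {test: round(base[test] / t, 2) for test, t in results.items()}
        for sku, results in times.items()
    }


relative = relative_performance(BENCH_TIMES)
for sku, scores in relative.items():
    row = "  ".join(f"{test}={score:.2f}" for test, score in scores.items())
    print(f"{sku:<11} {row}")
```

On these numbers, D64ds_v5 scores above 1.00 on all four tests, while the HB-series SKUs score above 1.00 on FFT and ODE but below it on LU and Sparse.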

 

 

 

In the chart, you may notice a third column for HB120rs_v3 labeled AVX2. There is some belief within MATLAB circles that MATLAB is "crippled" on AMD processors. That was not my experience. I tested the supposition by forcing MATLAB into MKL debug mode, using a Windows batch file to launch MATLAB in AVX2 mode:

 

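@echo off
rem Ask Intel MKL to use its AVX2 code path regardless of CPU vendor
set MKL_DEBUG_CPU_TYPE=5
rem Launch MATLAB with the variable set (assumes matlab.exe is on the PATH)
matlab.exe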

 

While performance was slightly higher (roughly 1-5% faster), the difference was within the margin of error for the results, so the workaround proved largely unnecessary.

 

 

 

Conclusion:

 

 

 

MATLAB is a powerful computational tool that is widely used, particularly within the Financial Services Industry (FSI). However, to achieve optimal performance and efficiency, it's crucial to understand the factors that affect MATLAB's performance and how to match the workload to the hardware environment. Choosing the right Azure VM SKU, offloading computations to HPC Pack, benchmarking workloads, and optimizing MATLAB code are all effective ways to improve MATLAB's performance and scalability. Understanding your technical requirements for the computational environment will lead you to a specific SKU and help you decide whether to purchase a cloud savings plan or reserved instances for a portion of your VMs. By following these best practices, MATLAB users can reduce processing time, enhance data analysis and simulations, and ultimately improve their productivity and decision-making. Whether you run MATLAB on-premises or in the cloud, optimizing its performance is critical to keeping data scientists satisfied and delivering results faster.

 
