Add a new Partition to a running CycleCloud SLURM cluster

Jerrance

Overview​


Azure CycleCloud (CC) is a user-friendly platform that orchestrates High-Performance Computing (HPC) environments on Azure, enabling admins to set up infrastructure, job schedulers, filesystems and scale resources efficiently at any size. It's designed for HPC administrators intent on deploying environments with specific schedulers.



SLURM is a widely used open-source HPC job scheduler with a scalable, fault-tolerant design suited to Linux clusters of any size. It manages resources, workloads, accounting and monitoring, supports parallel and distributed computing, and organizes compute nodes into partitions.



This post explains how to add a new partition to a running SLURM cluster in CycleCloud without terminating or restarting the cluster.

Jerrance_0-1722712333744.png



Requirements/Versions:​

  • CycleCloud Server (CC version used is 8.6.2)
  • CycleCloud CLI initialized on the CycleCloud VM
  • A Running Slurm Cluster
    • CycleCloud Slurm project version used is 3.0.7
    • Slurm version used is 23.11.7-1
  • SSH and HTTPS access to CycleCloud VM

High Level Overview​

  1. Git clone the CC SLURM repo (not required if you already have a slurm template file)
  2. Edit the Slurm template to add a new partition
  3. Export parameters from the running SLURM cluster
  4. Import the updated template file to the running cluster
  5. Activate the new nodearray(s)
  6. Update the cluster settings (VM size, core count, Image, etc)
  7. Scale the cluster to create the nodes

Step 1: Git clone the CC SLURM repo​


SSH into the CC VM and run the following commands:

sudo yum install -y git
git clone https://github.com/Azure/cyclecloud-slurm.git
cd cyclecloud-slurm/templates
ll

Jerrance_1-1722712333750.png



Step 2: Edit the SLURM template to add new partition(s)​


Copy the “slurm.txt” template file and edit the copy with your editor of choice (e.g. vi, vim, nano, VS Code Remote):

cp slurm.txt slurm-part.txt
vim slurm-part.txt



In the template file, a nodearray is the CC configuration unit that maps to a SLURM partition. The default template defines three nodearrays:



hpc: tightly coupled MPI workloads with InfiniBand (slurm.hpc = true)

htc: massively parallel throughput jobs without InfiniBand (slurm.hpc = false)

dynamic: enables multiple VM types in the same partition



Choose the nodearray type for the new partition (hpc or htc) and duplicate the matching [[nodearray …]] config section. For example, to create a new nodearray named “GPU” based on the hpc nodearray (the hpc nodearray config is included for reference):



[[nodearray hpc]]
Extends = nodearraybase
MachineType = $HPCMachineType
ImageName = $HPCImageName
MaxCoreCount = $MaxHPCExecuteCoreCount
Azure.MaxScalesetSize = $HPCMaxScalesetSize
AdditionalClusterInitSpecs = $HPCClusterInitSpecs
EnableNodeHealthChecks = $EnableNodeHealthChecks

[[[configuration]]]
slurm.default_partition = true
slurm.hpc = true
slurm.partition = hpc

[[nodearray GPU]]
Extends = nodearraybase
MachineType = $GPUMachineType
ImageName = $GPUImageName
MaxCoreCount = $MaxGPUExecuteCoreCount
Azure.MaxScalesetSize = $HPCMaxScalesetSize
AdditionalClusterInitSpecs = $GPUClusterInitSpecs
EnableNodeHealthChecks = $EnableNodeHealthChecks

[[[configuration]]]
slurm.default_partition = false
slurm.hpc = true
slurm.partition = gpu
slurm.use_pcpu = false



NOTE: Only one nodearray can set “slurm.default_partition = true”, and by default it is the hpc nodearray. Set the new nodearray to false, or if you set it to true, change the hpc nodearray to false.
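
Since only one nodearray may own the default partition, a quick check of the edited template can catch a conflict before it is imported. The following is a minimal sketch, assuming the edited copy is named slurm-part.txt as above; count_default_partitions is a hypothetical helper name and not part of the cyclecloud CLI:

```shell
# Hypothetical helper (not part of the cyclecloud CLI): count the lines in a
# template that set slurm.default_partition to true.
count_default_partitions() {
    # grep -c prints the match count but exits non-zero when the count is 0,
    # so "|| true" keeps the function's exit status clean.
    grep -Eic '^[[:space:]]*slurm\.default_partition[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$1" || true
}

# Example usage (uncomment to run against your edited template):
# count_default_partitions slurm-part.txt
```

A result greater than 1 means that two nodearrays both claim the default partition and one of them should be set to false.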



The “variables” in the nodearray config (e.g. $GPUMachineType) are referred to as “Parameters” in CC. Parameters are attributes exposed in the CC GUI to enable per-cluster customization. Further down in the template file, the Parameters configuration begins with the [parameters About] section. Each Parameter referenced in the new nodearray (e.g. $GPUMachineType) needs a matching [[[parameter …]]] block in this section, so we add several of them.



Add the GPUMachineType from HPCMachineType:



[[[parameter HPCMachineType]]]
Label = HPC VM Type
Description = The VM type for HPC execute nodes
ParameterType = Cloud.MachineType
DefaultValue = Standard_F2s_v2

[[[parameter GPUMachineType]]]
Label = GPU VM Type
Description = The VM type for GPU execute nodes
ParameterType = Cloud.MachineType
DefaultValue = Standard_F2s_v2



Add the MaxGPUExecuteCoreCount from MaxHPCExecuteCoreCount:



[[[parameter MaxHPCExecuteCoreCount]]]
Label = Max HPC Cores
Description = The total number of HPC execute cores to start
DefaultValue = 100
Config.Plugin = pico.form.NumberTextBox
Config.MinValue = 1
Config.IntegerOnly = true

[[[parameter MaxGPUExecuteCoreCount]]]
Label = Max GPU Cores
Description = The total number of GPU execute cores to start
DefaultValue = 100
Config.Plugin = pico.form.NumberTextBox
Config.MinValue = 1
Config.IntegerOnly = true



Add the GPUImageName from HPCImageName:



[[[parameter HPCImageName]]]
Label = HPC OS
ParameterType = Cloud.Image
Config.OS = linux
DefaultValue = almalinux8
Config.Filter := Package in {"cycle.image.centos7", "cycle.image.ubuntu20", "cycle.image.ubuntu22", "cycle.image.sles15-hpc", "almalinux8"}

[[[parameter GPUImageName]]]
Label = GPU OS
ParameterType = Cloud.Image
Config.OS = linux
DefaultValue = almalinux8
Config.Filter := Package in {"cycle.image.centos7", "cycle.image.ubuntu20", "cycle.image.ubuntu22", "cycle.image.sles15-hpc", "almalinux8"}



Add the GPUClusterInitSpecs from HPCClusterInitSpecs:

[[[parameter HPCClusterInitSpecs]]]
Label = HPC Cluster-Init
DefaultValue = =undefined
Description = Cluster init specs to apply to HPC execute nodes
ParameterType = Cloud.ClusterInitSpecs

[[[parameter GPUClusterInitSpecs]]]
Label = GPU Cluster-Init
DefaultValue = =undefined
Description = Cluster init specs to apply to GPU execute nodes
ParameterType = Cloud.ClusterInitSpecs
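
It is easy to reference a Parameter in the nodearray and forget its [[[parameter …]]] block. As a rough sketch (not an official validation tool), the helper below lists every $Name referenced anywhere in a template that has no matching parameter block. The function name missing_parameters is hypothetical, and the simple pattern matching may report false positives if the template uses "$" in other contexts:

```shell
# Hypothetical helper: print each $Parameter referenced in a template that
# has no matching [[[parameter NAME]]] block.
missing_parameters() {
    template="$1"
    # Collect every $Name reference, strip the "$", and de-duplicate.
    grep -Eo '\$[A-Za-z_][A-Za-z0-9_]*' "$template" | tr -d '$' | sort -u |
    while read -r name; do
        # Report the name if no parameter block defines it.
        if ! grep -Eq "^[[:space:]]*\[\[\[parameter[[:space:]]+${name}[[:space:]]*\]\]\]" "$template"; then
            printf '%s\n' "$name"
        fi
    done
}

# Example usage (uncomment to run against your edited template):
# missing_parameters slurm-part.txt
```

Empty output suggests that every referenced Parameter is defined. Any name that is printed is worth checking by hand before the import.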



NOTE: You can customize the "DefaultValue" of each parameter in the template to suit your requirements, or change the values later in the CycleCloud GUI.

Save the template file and exit (e.g. :wq in vi/vim).



Step 3: Export parameters from the running SLURM cluster​


You now have an updated SLURM template file that adds a new GPU partition. The template must be “imported” into CycleCloud to overwrite the existing cluster definition. Before doing that, however, we need to export the cluster's current GUI parameter values into a local JSON file to use in the import process. Without this JSON file, the cluster configs are all reset to the default values specified in the template file, overwriting any customizations applied to the cluster in the GUI.



From the CycleCloud VM, run a command of the following form:

cyclecloud export_parameters <cluster_name> > <file_name>.json



For my cluster the specific command is:

cyclecloud export_parameters jm-slurm-test > jm-slurm-test-params.json
cat jm-slurm-test-params.json
{
"UsePublicNetwork" : false,
"configuration_slurm_accounting_storageloc" : null,
"AdditionalNFSMountOptions" : null,
"About shared" : null,
"NFSSchedAddress" : null,
"loginMachineType" : "Standard_D8as_v4",
"DynamicUseLowPrio" : false,
"configuration_slurm_accounting_password" : null,
"Region" : "southcentralus",
"MaxHPCExecuteCoreCount" : 240,
"NumberLoginNodes" : 0,
"HTCImageName" : "cycle.image.ubuntu22",
"MaxHTCExecuteCoreCount" : 10,
"AdditionalNFSExportPath" : "/data",
"DynamicClusterInitSpecs" : null,
"About shared part 2" : null,
"HPCImageName" : "cycle.image.ubuntu22",
"SchedulerClusterInitSpecs" : null,
"SchedulerMachineType" : "Standard_D4as_v4",
"NFSSchedDiskWarning" : null,
…<truncated>
}
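
Because the export is taken before the GPU nodearray exists, it should not contain the new GPU parameters. Those would then be expected to fall back to the template's DefaultValue until they are set in the GUI (Step 6). A small sketch to confirm which names are absent from the export, assuming the file name used above (params_missing_from_export is a hypothetical helper):

```shell
# Hypothetical helper: print each parameter name (arguments 2..n) that does
# not appear as a quoted key in the exported parameters file (argument 1).
params_missing_from_export() {
    params_file="$1"
    shift
    for name in "$@"; do
        # -F treats the quoted key as a fixed string rather than a regex.
        grep -Fq "\"$name\"" "$params_file" || printf '%s\n' "$name"
    done
}

# Example usage (uncomment to run against your export):
# params_missing_from_export jm-slurm-test-params.json \
#     GPUMachineType MaxGPUExecuteCoreCount GPUImageName GPUClusterInitSpecs
```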



If the cyclecloud command does not work, you may need to initialize the CLI as described in the docs: Install the Command Line Interface - Azure CycleCloud



Step 4: Import the updated template file to the running cluster​


To import the updated template into the running cluster in CycleCloud, run a command of the following form:

cyclecloud import_cluster <cluster_name> -c Slurm -f <template file name>.txt -p <parameter file name> --force



For my cluster the specific command is:

cyclecloud import_cluster jm-slurm-test -c Slurm -f slurm-part.txt -p jm-slurm-test-params.json --force
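
Since --force overwrites the existing cluster definition, it can be worth refusing to run the import when the template or the exported parameter file is missing or empty. The wrapper below is only a sketch under that assumption; import_with_params is a hypothetical name, and it uses the same flags as the command above:

```shell
# Hypothetical wrapper: run the import only if both input files exist and
# are non-empty, so a failed export cannot silently reset the cluster.
import_with_params() {
    cluster="$1"
    template="$2"
    params="$3"
    if [ ! -s "$template" ]; then
        echo "template missing or empty: $template" >&2
        return 1
    fi
    if [ ! -s "$params" ]; then
        echo "parameter file missing or empty: $params" >&2
        return 1
    fi
    cyclecloud import_cluster "$cluster" -c Slurm -f "$template" -p "$params" --force
}

# Example usage (uncomment to run):
# import_with_params jm-slurm-test slurm-part.txt jm-slurm-test-params.json
```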

Jerrance_2-1722712333752.png



In the CycleCloud GUI we can now see the “gpu” nodearray has been added. Click on the “Arrays” tab in the middle panel as shown in the following screen capture:

Jerrance_3-1722712333755.png



The gpu nodearray is added to the cluster but it is not yet “Activated,” which means it is not yet available for use.



Step 5: Activate the new nodearray(s)​


The cyclecloud start_cluster command activates the new nodearray. The command format is:

cyclecloud start_cluster <cluster_name>



For my cluster the command is:

cyclecloud start_cluster jm-slurm-test

Jerrance_4-1722712333756.png



In the CycleCloud GUI, the gpu nodearray status moves to “Activation” and finally to “Activated”:

Jerrance_5-1722712333759.png

Step 6: Update the cluster settings​


Edit the cluster settings in the CycleCloud GUI to pick the “GPU VM Type” and “Max GPU Cores” in the “Required Settings” section:

Jerrance_6-1722712333764.png



Update the “GPU OS” and “GPU Cluster-Init” as needed in the “Advanced Settings” section:

Jerrance_7-1722712333769.png



Step 7: Scale the cluster to create the nodes​


Up to this point we have added the new nodearray to CycleCloud, but SLURM does not yet know about the new GPU partition. We can see this by running the sinfo command on the scheduler VM:

Jerrance_8-1722712333770.png





The final step is to “scale” the cluster to “pre-define” the compute nodes as needed by SLURM. The CycleCloud azslurm scale command will accomplish this:

Jerrance_9-1722712333771.png
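
The two screenshots above correspond to running sinfo and azslurm scale on the scheduler VM. As a hedged sketch of the same check-then-scale sequence, the helpers below (partition_exists and scale_if_missing are hypothetical names) run azslurm scale only when Slurm does not yet list the partition. Root privileges on the scheduler may be required, which is an assumption to verify in your environment:

```shell
# Hypothetical helper: succeed if Slurm already lists the named partition.
partition_exists() {
    # -h drops the header; %P prints partition names, with a trailing "*"
    # on the default partition, which tr removes along with any padding.
    sinfo -h -o '%P' | tr -d '* ' | grep -Fxq "$1"
}

# Hypothetical wrapper: run "azslurm scale" only when the partition is
# missing, then confirm that Slurm now reports it.
scale_if_missing() {
    if partition_exists "$1"; then
        echo "partition $1 is already defined in Slurm"
    else
        azslurm scale && partition_exists "$1"
    fi
}

# Example usage on the scheduler VM (uncomment to run):
# scale_if_missing gpu
```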



Your cluster is now ready to use the new GPU partition.

Summary


Adding a new partition to a running SLURM cluster with Azure CycleCloud lets you bring in different types of compute nodes without terminating or restarting the cluster. The steps in this article cover editing the template to create a new nodearray, importing it together with the exported parameters, activating the nodearray, updating the cluster settings, and scaling the cluster so that SLURM picks up the new partition.



References:
CycleCloud Documentation

CycleCloud-SLURM Github repository

Microsoft Training for SLURM on Azure CycleCloud

SLURM documentation
