Jump to content

Featured Replies

Posted

Introducing Azure CycleCloud Slurm Workspace (Preview)

 

A new way to create and manage CycleCloud Slurm clusters on Azure

 

 

 

Slurm is one of the most popular and widely used open-source workload managers for AI/HPC and cloud computing. Slurm enables users to run large-scale parallel and distributed applications across a set of compute nodes, and provides features such as job scheduling, resource management, fault tolerance, and power management. Slurm is used by many of the world's top supercomputers, research institutes, universities, and enterprises.

 

 

 

However, setting up and managing Slurm clusters on the cloud can be challenging and time-consuming, especially for users who are not familiar with the cloud environment or the Slurm configuration. Users must deal with tasks such as provisioning and scaling compute nodes, installing and updating Slurm software, configuring network and storage, monitoring cluster health and performance, and troubleshooting issues. These tasks can distract users from their core research or business objectives and reduce the productivity and efficiency of their AI/HPC workloads.

 

 

 

What is Azure CycleCloud Slurm Workspace?

 

Azure CycleCloud Slurm Workspace is a new solution that simplifies and streamlines the creation and management of Slurm clusters on Azure. Azure CycleCloud Slurm Workspace is an Azure marketplace solution template that allows users to easily create and configure pre-defined Slurm clusters with Azure CycleCloud, without requiring any prior knowledge of the cloud or Slurm. Slurm clusters will be pre-configured with PMix v4, Pyxis and enroot to support containerized AI Slurm jobs. Users can access the provisioned login node using SSH or Visual Studio Code to perform common tasks like submitting and managing Slurm jobs.

 

 

 

What are the benefits of Azure CycleCloud Slurm Workspace?

 

Azure CycleCloud Slurm Workspace offers several benefits for users who want to run Slurm workloads on Azure, such as:

 

 

  • Easy and fast cluster creation: Users can create Slurm clusters on Azure in minutes, by following a few simple steps in the GUI. This must be compared to days or weeks of work in the past without Azure CycleCloud Slurm Workspace. Users can choose from a variety of Azure virtual machine (VM) sizes and types and customize the cluster settings such as the number of nodes, the network configuration, the storage options from Azure NetApp Files to Azure Managed Lustre Filesystem, and the Slurm parameters.

  • Flexible and dynamic cluster management: Slurm clusters will be scaled up or down by Azure CycleCloud. Users can also monitor the cluster status, performance, and utilization, and view the cluster logs and metrics in the GUI. Users can also delete their Slurm clusters when they are no longer needed, and only pay for the resources they use.
     

 

How to get started with Azure CycleCloud Slurm Workspace?

 

Azure CycleCloud Slurm Workspace is currently in preview, and we invite you to try it out and share your feedback with us. To get started, you need to have an Azure subscription. You can sign up to be whitelisted Create your initial Azure subscriptions, and follow the instructions Slurm Scheduler Integration to deploy and configure your Slurm cluster using Azure CycleCloud Slurm Workspace.

 

 

 

We hope you enjoy using Azure CycleCloud Slurm Workspace, and we look forward to hearing from you. Please send us feedback, questions, and suggestions through the Azure CycleCloud forum or the Azure CycleCloud support channel. We appreciate your participation and input in making Azure CycleCloud Slurm Workspace a better product for you and the AI/HPC community.

 

Continue reading...

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...