Unmounting Azure Managed Lustre Filesystem in a CycleCloud HPC cluster using Azure Scheduled Events.

vinilv · Oct 4, 2023

Lustre has a known behaviour: if a VM that has the filesystem mounted is evicted or deleted as part of a workflow without releasing its filesystem lock, Lustre keeps holding that lock until a timeout expires, roughly 10 to 15 minutes later. During that window, the other VMs (Lustre clients) using the same Lustre filesystem might experience intermittent hung mounts.

I recently published a blog post on using Azure Scheduled Events to cleanly unmount Azure Managed Lustre in a VMSS or a Spot VM. That approach helps prevent the hung-mount issue described above.

How to unmount Azure Managed Lustre filesystem using Azure Scheduled Events - Microsoft Community Hub

In this blog post, we will explore the process of unmounting Azure Managed Lustre in a CycleCloud HPC cluster using Azure Scheduled Events. This operation is triggered in response to Spot Instance eviction or a scale-down operation, particularly in autoscaled clusters.

Setting up this clean unmount is crucial when using Azure Managed Lustre Filesystem (AMLFS) within a CycleCloud environment.

Starting from version 8.2.2, CycleCloud can leverage Azure Scheduled Events for virtual machines (VMs). This functionality lets you deploy a script on your VM that runs automatically whenever one of the supported events takes place.

Reference: Using Scheduled Events - Azure CycleCloud
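
For context, here is a minimal sketch of what a Scheduled Events document looks like to a script running on the VM. CycleCloud's jetpack agent handles this for you, so nothing below is required for the setup in this post. The function names are hypothetical, the api-version value is an assumption, and the grep/sed parsing assumes the simple JSON layout the metadata service normally returns.

```shell
#!/bin/sh
# Illustrative only: jetpack watches for these events itself.
# Azure Instance Metadata Service endpoint for Scheduled Events.
# The api-version value is an assumption and may need updating.
IMDS_URL="http://169.254.169.254/metadata/scheduledevents?api-version=2020-07-01"

# Fetch the raw Scheduled Events JSON (only works from inside an Azure VM).
fetch_scheduled_events() {
    curl -s -H "Metadata:true" --noproxy "*" "$IMDS_URL"
}

# Print one EventType per line from a Scheduled Events JSON document on stdin.
# Uses grep/sed instead of a JSON parser to avoid extra dependencies.
list_event_types() {
    grep -o '"EventType"[[:space:]]*:[[:space:]]*"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/'
}

# Succeed if the JSON on stdin contains a Preempt or Terminate event,
# the two cases in which Lustre should be unmounted.
needs_unmount() {
    list_event_types | grep -Eq '^(Preempt|Terminate)$'
}
```

On an Azure VM, something like fetch_scheduled_events | needs_unmount would report whether an eviction or termination is pending.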

First, we need to enable terminate notification for the node array. Update the CycleCloud Slurm, PBS, or Gridengine template and add EnableTerminateNotification = true and TerminateNotificationTimeout = 10m to the node array definition:

[[nodearray hpc]]
Extends = nodearraybase
MachineType = $HPCMachineType
ImageName = $HPCImageName
MaxCoreCount = $MaxHPCExecuteCoreCount
Azure.MaxScalesetSize = $HPCMaxScalesetSize
AdditionalClusterInitSpecs = $HPCClusterInitSpecs
EnableTerminateNotification = true
TerminateNotificationTimeout = 10m
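
Before importing, it can be worth checking that both settings actually made it into the edited template. The following is a small sketch of such a check. The function name is hypothetical, and it only confirms that the settings appear somewhere in the file, not that they sit under the intended node array.

```shell
#!/bin/sh
# Succeed only if the template file sets both terminate-notification options.
# Matching tolerates extra whitespace; the timeout value itself is not validated.
has_terminate_notification() {
    template_file="$1"
    grep -Eiq '^[[:space:]]*EnableTerminateNotification[[:space:]]*=[[:space:]]*true' "$template_file" &&
    grep -Eq '^[[:space:]]*TerminateNotificationTimeout[[:space:]]*=' "$template_file"
}
```

For example, has_terminate_notification slurm.txt (where slurm.txt is a placeholder for your template file) returns success only when both lines are present.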

Import the template into CycleCloud and start a new cluster. Then add the following scripts on all the nodes so that Lustre is unmounted in the event of a Spot eviction or scale-down operation. Replace /LUSTRE_MOUNT_POINT with the actual mount path.

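# Directory for the jetpack event scripts.
mkdir -p /opt/cycle/jetpack/scripts
# Use ">" (not ">>") so re-running this does not append a second copy.
# The quoted 'EOF' writes the script body out literally.
cat > /opt/cycle/jetpack/scripts/onTerminate.sh << 'EOF'
#!/bin/sh
# Kill processes using the mount, wait briefly, then lazily unmount.
/bin/fuser -ku /LUSTRE_MOUNT_POINT
/bin/sleep 5
/bin/umount -l /LUSTRE_MOUNT_POINT
EOF
# Spot eviction (Preempt) runs the same steps.
cat > /opt/cycle/jetpack/scripts/onPreempt.sh << 'EOF'
#!/bin/sh
/bin/fuser -ku /LUSTRE_MOUNT_POINT
/bin/sleep 5
/bin/umount -l /LUSTRE_MOUNT_POINT
EOF
chmod +x /opt/cycle/jetpack/scripts/onTerminate.sh
chmod +x /opt/cycle/jetpack/scripts/onPreempt.sh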
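
The scripts above unmount unconditionally. One possible refinement, shown here only as a sketch, is to confirm that the path really is a Lustre mount first, so the script does nothing on nodes where the filesystem was never mounted. The function names and the optional mounts-file argument are hypothetical additions and are not part of cyclecloud-amlfs.

```shell
#!/bin/sh
# Succeed if the given path appears as a Lustre mount in a mounts table.
# The second argument defaults to /proc/mounts; it exists so the check can
# be exercised against a sample file.
is_lustre_mounted() {
    mount_point="$1"
    mounts_file="${2:-/proc/mounts}"
    awk -v mp="$mount_point" \
        '$2 == mp && $3 == "lustre" { found = 1 } END { exit !found }' \
        "$mounts_file"
}

# Unmount only when the path is a Lustre mount; otherwise do nothing.
# Mirrors the steps of the scripts above: kill users, pause, lazy unmount.
safe_unmount_lustre() {
    mount_point="$1"
    if is_lustre_mounted "$mount_point"; then
        fuser -ku "$mount_point"
        sleep 5
        umount -l "$mount_point"
    fi
}
```

Inside onTerminate.sh or onPreempt.sh, the body would then reduce to a single call such as safe_unmount_lustre /LUSTRE_MOUNT_POINT.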

I have developed a project called "cyclecloud-amlfs" with the goal of automating the entire process of installing Lustre clients, mounting the filesystem, and unmounting Lustre in the event of VM eviction or termination operations. This project is designed to simplify the management of Lustre on Azure.

cyclecloud-amlfs v.1.0.0 supported client installation and Lustre mounting; cyclecloud-amlfs v.2.0.0 adds support for scheduled events to unmount the Lustre filesystem.

Integrating Azure Managed Lustre Filesystem (AMLFS) into CycleCloud HPC Cluster

The project automatically enables Azure Scheduled Events to handle Spot eviction or node termination/scale-down in a VMSS. This ensures a clean unmount of the Azure Managed Lustre Filesystem (AMLFS) to maintain filesystem integrity.

Additionally, the project includes templates for Slurm, PBS, and Gridengine, as well as terminate notification settings. These features make it easier for CycleCloud users to seamlessly integrate Azure Managed Lustre Filesystem into their HPC clusters, enhancing the overall cluster management experience.
