jasonschuff
Introduction
This blog post walks through manual steps for exporting data to blob storage while retaining specific POSIX attributes. The export uses the Lustre HSM (Hierarchical Storage Management) interface, so the Azure Managed Lustre file system must have HSM enabled and set up in advance. See this article: Blob integration
For more information on setting up automatic synchronization to Azure Blob Storage for Azure Managed Lustre, refer to this blog post: Automatic Synchronization to Azure BLOB Storage.
Connect client to the Lustre file system
Client machines running Linux can access Azure Managed Lustre directly. See the following article that details the client prerequisites: Connect client to the file system
To mount the Lustre file system:
Code:
sudo mount -t lustre -o noatime,flock <MGS_IP>@tcp:/lustrefs /<client_path>
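After mounting, it can help to confirm that the path is actually backed by Lustre before copying data into it. The sketch below is one possible check, assuming a Linux client where the kernel mount table is readable at /proc/mounts. The function name is illustrative and not part of any Lustre tooling.

```shell
# Succeeds (exit 0) when MOUNT_POINT appears in the mount table with type "lustre".
# Usage: is_lustre_mount MOUNT_POINT [MOUNT_TABLE]
# MOUNT_TABLE defaults to /proc/mounts; the optional second argument exists so
# the check can be run against a saved copy of the table.
is_lustre_mount() {
    awk -v mp="$1" '$2 == mp && $3 == "lustre" { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}
```

For example, running is_lustre_mount /lustredata && echo "mounted" prints a message only when /lustredata is a Lustre mount. Pass the mount point without a trailing slash, since it is compared literally against the mount table.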
Migrate data retaining POSIX attributes
Once a client is connected to the file system, you can copy data directly into it.
- Assuming the source location is /mydata and the destination Lustre file system is mounted at /lustredata. Because the source path has no trailing slash, rsync creates the directory /lustredata/mydata.
- The -a (archive) option preserves POSIX attributes such as ownership, permissions, timestamps, and symlinks. Preserving ownership requires running rsync as root. See the rsync manual page for more details.
To copy data into Lustre:
Code:
rsync -av /mydata /lustredata
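A quick way to spot-check that the copy kept its attributes is to compare stat output for a source file and its copy. This is a minimal sketch, assuming GNU stat (the -c format option). Both function names are hypothetical helpers.

```shell
# Print mode (octal), numeric owner, numeric group, and mtime (epoch seconds).
# Relies on GNU stat's -c format option.
posix_attrs() {
    stat -c '%a %u %g %Y' "$1"
}

# Succeeds when both paths report identical mode, owner, group, and mtime.
same_attrs() {
    [ "$(posix_attrs "$1")" = "$(posix_attrs "$2")" ]
}
```

For instance, same_attrs /mydata/file.txt /lustredata/mydata/file.txt || echo "attributes differ" flags a mismatch, where file.txt stands in for any file that was copied.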
Note: When migrating data to AMLFS, ensure that the total storage used does not exceed the file system's capacity. If the source data is larger than the file system can hold, archive files to blob storage and release them from Lustre as needed before continuing the migration.
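The capacity check in the note above can be sketched before starting a copy. The helpers below are illustrative and assume GNU coreutils (du -sb and df --output). They compare the apparent size of the source against the free space reported for the destination, which is only a rough guide to how Lustre will place the data.

```shell
# Apparent size in bytes of everything under a path (GNU du).
bytes_used() {
    du -sb "$1" | cut -f1
}

# Bytes available on the file system that holds a path (GNU df).
bytes_available() {
    df --output=avail -B1 "$1" | tail -n 1 | tr -d ' '
}

# Succeeds when the source tree fits in the destination's free space.
# Usage: fits_in_destination SRC DEST
fits_in_destination() {
    [ "$(bytes_used "$1")" -le "$(bytes_available "$2")" ]
}
```

A typical call would be fits_in_destination /mydata /lustredata || echo "archive and release files first". On a Lustre client, lfs df -h /lustredata gives a per-target view of usage that may be more informative than df alone.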
Export data and attributes to blob storage
Once the files have been copied into the Lustre file system, use the export job process to write the files, along with their POSIX attributes as metadata, to the blob storage container. This process uses export jobs to archive the files.
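Export jobs can be created from the Azure portal. As a rough illustration of the same idea from a client, the sketch below queues files for archiving with the standard Lustre lfs hsm_archive subcommand. It assumes HSM is enabled on the file system and that the commands run with sufficient privileges (typically from a root shell). The archive_tree function and the LFS override variable are hypothetical conveniences, not part of Lustre or Azure tooling.

```shell
# Queue every regular file under DIR for archiving to the linked blob container.
# Usage: archive_tree DIR
# LFS names the Lustre client utility; setting LFS=echo turns this into a dry
# run that only prints the arguments that would be passed to lfs.
archive_tree() {
    find "$1" -type f -print0 | xargs -0 -r -n 1 "${LFS:-lfs}" hsm_archive
}
```

After running archive_tree /lustredata/mydata, the state of an individual file can be inspected with lfs hsm_state followed by the file path. Once a file is archived, lfs hsm_release followed by the file path frees its space on Lustre while keeping the archived copy in blob storage.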
Which POSIX attributes are retained during an export job?
When you export files from your Azure Managed Lustre system to blob storage, additional attributes are saved as metadata on each blob, as described here: Metadata for exported files. Depending on the type of object, the following attributes may be written as metadata to each object in blob storage:
Parameter | Description |
modtime | The last modification time of the file |
owner | The owner of the file |
group | The group owner of the file |
permissions | The existing permissions of the file |
hdi_isfolder | If object is a folder, this value is set to true. Name corresponds with folder name. |
The metadata appears on each blob in the storage container, under the blob's metadata.
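One way to inspect this metadata from a terminal is the Azure CLI's az storage blob metadata show command. The wrapper below is a sketch. The function name, the AZ override variable, and the example account, container, and blob names are all placeholders, and it assumes the signed-in identity has read access to the container.

```shell
# Show the metadata key/value pairs stored on one exported blob.
# Usage: show_blob_metadata ACCOUNT CONTAINER BLOB_NAME
# AZ defaults to the Azure CLI; --auth-mode login uses the signed-in identity.
show_blob_metadata() {
    "${AZ:-az}" storage blob metadata show \
        --account-name "$1" \
        --container-name "$2" \
        --name "$3" \
        --auth-mode login
}
```

For example, show_blob_metadata mystorageaccount mycontainer mydata/file.txt should return a JSON object whose keys correspond to the attributes in the table above, provided the blob was written by an export.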
Restoring data into a new Azure Managed Lustre File System
Now that the blob storage contains the attributes for each blob object, including the permissions and ownership of each file and directory, this data can be imported into any new Azure Managed Lustre file system with those attributes retained. Follow these steps in order to import data using import jobs.
Note: This step is only required when setting up a new Azure Managed Lustre File System. It is not required when continuing to use the AMLFS that the data was originally copied to.
References
- Azure Managed Lustre File System Documentation
- Azure Managed Lustre with Automatic Synchronisation to Azure BLOB Storage
- GitHub repositories