How to build a customized container image for Azure Batch



Guest Jerry
Posted

Sometimes users need to install software or packages in the environment before a task is executed. This can easily be done with the start task feature of Azure Batch.

 

 

 

But when there are many dependencies to install, for example 20 packages on Linux, and some of them, such as TensorFlow, take a long time to install, this causes additional problems: the start task runs for a long time every time Azure Batch starts a node, and it may even time out or fail.

 

 

 

To avoid this issue, the user has two options: a custom image or a container. Azure Batch supports both. Using a custom image is already explained elsewhere, so this blog mainly explains how to use the container feature.

 

 

 

Containers have one additional advantage. When the Batch task (application) depends strongly on a specific operating system (OS) version, such as Ubuntu 18.04, which will reach EOL on Apr 30th 2023, a user who selects the container feature only needs to consider the compatibility between the container OS and the host OS (selected when you create the Batch pool). It is also easier to recreate or modify the environment, because the user only needs to modify the Dockerfile instead of creating a new virtual machine to capture a new custom image.

 

 

 

In this blog, the container image is based on Ubuntu 20.04. Assume that the Batch application depends on a Python package called numpy. This blog installs the package from a requirement.txt file, captures it into a container image, creates a Batch pool based on this container image, and verifies the environment by running a Batch task.

 

 

 

Note: This is the simplest possible example. In a real scenario, users will need to modify the Dockerfile and include many more of their own files when creating the container image.

 

 

 

Pre-requisites:

 

  • An Azure Batch account
  • An Azure Container Registry
  • The commands and/or packages needed to install the dependencies
  • Docker Desktop on Windows (The example in this blog builds the image on a Windows computer. The container image itself is Linux (Ubuntu 20.04), so Docker Desktop must be switched to Linux containers and the host OS of the Batch pool must be able to run Linux containers. If you build on a Linux computer, please adjust the steps accordingly.)

 

 

 

Step 1: Create your own container image

 

  1. To create a new container image, the user needs to prepare the following files:

  • The Dockerfile
  • Any other dependency files that are needed (for example, the requirement.txt in this scenario)

 

Note: The Dockerfile must not have a file extension. A file named Dockerfile.txt will not work.

 

[Figure: Docker folder files]

 

 

 

Dockerfile:

 

 

 

# Use the official Ubuntu 20.04 image as the base image
FROM ubuntu:20.04

# Set the working directory to /app
WORKDIR /app

# Copy the current directory contents (including requirement.txt) into the container at /app
COPY . /app

# Install pip3; update and install run in one layer so the package index is never stale
RUN apt-get update && apt-get -y install python3-pip

# Install the Python packages listed in requirement.txt (numpy in this example)
RUN pip3 install -r requirement.txt

# Optional: declare port 80 if the task must accept incoming requests
# EXPOSE 80

 

 

 

 

 

requirement.txt:

 

[Figure: contents of requirement.txt]

 

 

 

Note: In rare situations, a Batch task may need to wait for a request sent from another client to the Batch node. In this scenario, add an EXPOSE {port} line to the Dockerfile to declare the port the container listens on (the port must still be published to the host machine when the container runs). For more details about how to write a Dockerfile, please check the Dockerfile reference in the Docker documentation.

 

 

 

  2. Once the files are prepared, the user can start creating the container image.

  • Open Docker Desktop and make sure you are logged in

 

[Figure: Docker Desktop UI]

 

 

 

  • Right-click the Docker tray icon and select Switch to Linux containers.
  • Open PowerShell in the folder containing the Dockerfile and run the following command:

 

 

 

docker build -t {container_name} .

 

 

 

 

 

The expected result looks like the following. (This step might take a long time, depending on the network speed, because Docker has to download the base image.)

 

[Figure: build result of the Docker image]

 

 

 

  • Find the username and password on the Access keys page of the Azure Container Registry and use the following command to log in:

 

[Figure: Container Registry Access keys page]

 

 

 

 

 

docker login {registry_name}.azurecr.io -u {username} -p {password}

 

 

 

[Figure: docker login command result]

 

 

 

  • Use the following commands to tag the local image and upload it from local Docker to Azure Container Registry.

 

 

 

docker tag {container_name} {registry_name}.azurecr.io/{repository}

docker push {registry_name}.azurecr.io/{repository}
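
When the image has to be rebuilt often, the build, tag and push commands above can be generated from a few values instead of being typed by hand. The following is only a minimal sketch: the function name docker_publish_commands and the values batch-numpy, myregistry and batch/numpy are hypothetical, and the explicit image tag is an addition that the commands above do not use (they rely on the default latest tag).

```python
import shlex


def docker_publish_commands(image_name, registry_name, repository, tag="latest"):
    """Return the docker build, tag and push commands as argument lists."""
    # Full name of the image inside Azure Container Registry.
    remote = f"{registry_name}.azurecr.io/{repository}:{tag}"
    return [
        ["docker", "build", "-t", image_name, "."],
        ["docker", "tag", image_name, remote],
        ["docker", "push", remote],
    ]


# Hypothetical example values; replace them with your own names.
commands = docker_publish_commands("batch-numpy", "myregistry", "batch/numpy", tag="v1")
for command in commands:
    # Print each command in a form that can be pasted into a shell.
    print(shlex.join(command))
```

Each argument list can also be passed to subprocess.run(command, check=True) from the folder containing the Dockerfile, assuming docker login has already succeeded.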

 

 

 

[Figure: container image upload result]

 

 

 

At this point, the container image has been successfully created and uploaded to Azure Container Registry.

 

 

 

Step 2: Use the container image to run Batch task

 

  • The user can then create a new pool and configure the container-related settings for this pool. The other settings can be configured as usual.

 

[Figure: container registry page when creating the Batch pool]

 

 

 

[Figure: container registry list page]

 

 

 

[Figure: Batch pool configuration page]
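
For readers who create the pool through a script instead of the portal, the container settings shown above correspond roughly to the following structure. This is only a sketch: the field names follow the containerConfiguration section of the Batch REST API pool definition as far as I know it, so please verify them against the API version you use. The function name and the values myregistry and batch/numpy are hypothetical.

```python
import json


def container_configuration(registry_name, repository, username, password, tag="latest"):
    """Build the container section of a pool definition as a plain dict."""
    server = f"{registry_name}.azurecr.io"
    return {
        # Assumed value for Docker-compatible container support on the nodes.
        "type": "dockerCompatible",
        # Images that the nodes pull in advance when they join the pool.
        "containerImageNames": [f"{server}/{repository}:{tag}"],
        # Credentials of the private registry that holds the image.
        "containerRegistries": [
            {"registryServer": server, "username": username, "password": password}
        ],
    }


# Hypothetical example values; never commit a real password to source control.
config = container_configuration("myregistry", "batch/numpy", "myregistry", "<password>")
print(json.dumps(config, indent=2))
```

Listing the image here lets the nodes pull it while the pool is being created, so the first task does not have to wait for the download.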

 

 

 

  • Once the node is idle, add a new job with a new task and link it to this new Batch pool. When creating the task, remember to select the correct image. The following example uses a command line that makes the Python environment in the container import the numpy package, set a variable a to 1, and then print the value of a.

 

[Figure: Batch task creation page]
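
The task described above can be sketched as a task definition like the one below. The exact command line used in this blog is only visible in the screenshot, so the command here is an assumption that matches the description (import numpy, set a to 1, print a). The field names follow the task definition of the Batch REST API as far as I know it, and the task ID and image name are hypothetical.

```python
import json


def container_task(task_id, image_name):
    """Build a task definition that runs a small numpy check inside the container."""
    # Assumed command line: run through /bin/sh so that the quoting is handled by a shell.
    command_line = "/bin/sh -c \"python3 -c 'import numpy; a = 1; print(a)'\""
    return {
        "id": task_id,
        "commandLine": command_line,
        # The image must match one of the images configured on the pool.
        "containerSettings": {"imageName": image_name},
    }


# Hypothetical example values.
task = container_task("numpy-check", "myregistry.azurecr.io/batch/numpy:latest")
print(json.dumps(task, indent=2))
```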

 

 

 

The expected output should be just “1”.

 

[Figure: stdout result]

 

 

 

This proves that the container image was created successfully and that the Python package was installed correctly. Otherwise, there would be an error message reporting that the module was not found, like the following one.

 

[Figure: error message if the package is not installed]
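
When an image contains many packages, the same check can be done for all of them at once. The following is a minimal sketch, not part of the original example: the helper name missing_modules is hypothetical, and it assumes that each name in the list is the importable module name, which is not always identical to the package name written in requirement.txt.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# numpy is the only dependency in this blog; extend the list for a real image.
missing = missing_modules(["numpy"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All modules found")
```

Such a script could be copied into the image together with the other files and run as the first Batch task after every image change.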

 

 

 

Conclusion

 

 

Real user scenarios will be more complicated, but the basic approach to creating a customized container image stays the same. The example Dockerfile in this blog shows how to:

 

  1. Run command to install dependencies
  2. Copy the necessary dependency files into Docker image

 

These two steps are also the main parts that users need to customize.

 
