
Azure VNET vs Azure IR inside of Pipelines



Guest Liliam_Leme

I was working with a customer this week who wanted to understand why an ADF (Azure Data Factory) pipeline running on an integration runtime (IR) inside the managed virtual network was slower than the same pipeline running on the Azure IR. This was not the first time a customer had asked me about this, so I thought I would blog about it.

 

 

 

When you create an Azure integration runtime inside a VNet (virtual network), it is provisioned within the managed virtual network. It uses private endpoints to connect to data stores, which keeps data integration isolated and secure.

 

 

 

The service has to start the IR inside the managed VNet, and that takes time (Fig 1). It looks something like this:

 

Fig 1. IR starting inside the managed VNet (image not preserved).

 

 

 

 

 

Conceptually, an IR with a managed VNet and one without it work differently, as documented here:

 

 

 

"Checking the details of pipeline runs, you can see that the slow pipeline is running on Managed VNet (Virtual Network) IR while the normal one is running on Azure IR. By design, Managed VNet IR takes longer queue time than Azure IR as we are not reserving one compute node per service instance, so there is a warm up for each copy activity to start, and it occurs primarily on VNet join rather than Azure IR."

 

 

 

 

 

To demonstrate this, I duplicated a pipeline in two environments: one with a managed VNet and one without. Both copied the same files from the same folder in the same storage account.

 

 

 

Note: At the time this was first written, TTL (Time-to-Live) was not an option for Copy activity.

 

 

 

 

 

Performance Results

 

 

 

 

Performance With VNet (Fig 2):

 

Fig 2. Copy activity performance with the managed VNet (image not preserved).

 

 

 

 

 

Performance Without VNet (Fig 3):

 

Fig 3. Copy activity performance without the managed VNet (image not preserved).

 

 

 

 

 

Update: In June 2022, a new feature was released to handle the warm-up for the copy activity. I tested it, and it made a huge difference in copy activity execution time. Read more about this update here.

 

 

 

Conclusion

 

 

As Fig 2 and Fig 3 show, there is a considerable difference in queue time, in addition to the throughput (which varies). Queue time here corresponds to compute node warm-up, which usually takes longer inside a managed VNet.
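To make the comparison concrete, here is a minimal Python sketch of how you could put the two runs side by side once you have read the queue and transfer durations from the run details. The CopyRun class, its field names, and the numbers are all hypothetical placeholders (they are not the values from Fig 2 or Fig 3), and treating total time as queue plus transfer is a simplification.

```python
from dataclasses import dataclass


@dataclass
class CopyRun:
    """Hypothetical summary of one copy activity run, in seconds."""
    label: str
    queue_seconds: float      # time spent waiting for compute (warm-up)
    transfer_seconds: float   # time spent actually moving data

    @property
    def total_seconds(self) -> float:
        # Simplification: ignores any stage other than queue and transfer.
        return self.queue_seconds + self.transfer_seconds

    @property
    def queue_share(self) -> float:
        # Fraction of the run that was spent queued.
        total = self.total_seconds
        return self.queue_seconds / total if total else 0.0


def extra_queue_seconds(vnet: CopyRun, azure: CopyRun) -> float:
    """Additional queue time paid on the managed VNet IR."""
    return vnet.queue_seconds - azure.queue_seconds


# Placeholder numbers for illustration only, not measurements.
vnet_run = CopyRun("managed VNet IR", queue_seconds=60.0, transfer_seconds=20.0)
azure_run = CopyRun("Azure IR", queue_seconds=5.0, transfer_seconds=20.0)

for run in (vnet_run, azure_run):
    print(f"{run.label}: total={run.total_seconds:.0f}s, "
          f"queue share={run.queue_share:.0%}")
print(f"extra queue time on VNet: "
      f"{extra_queue_seconds(vnet_run, azure_run):.0f}s")
```

With short copies like these, most of the difference between the two environments comes from the queue portion rather than from the transfer itself.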

 

 

 

However, if you want to avoid the copy activity warm-up, check out the TTL preview feature for ADF.
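As a rough intuition for why a TTL helps, the toy model below counts how many copy activities would pay a warm-up if compute stayed warm for a fixed period after each run. This is only a sketch under my own assumptions (runs are sequential and do not overlap, and the function name and numbers are made up); it is not how the service actually schedules compute.

```python
def count_warmups(start_times, run_seconds, ttl_seconds):
    """Count runs that pay a warm-up under a simple TTL model.

    Assumes runs are sequential and non-overlapping, and that compute
    stays warm for ttl_seconds after each run finishes.
    """
    warmups = 0
    warm_until = None  # time until which compute is assumed to stay warm
    for start, duration in zip(start_times, run_seconds):
        if warm_until is None or start >= warm_until:
            warmups += 1  # compute was cold, so this run waits for warm-up
        warm_until = start + duration + ttl_seconds
    return warmups


# Hypothetical schedule: four 30-second copies, start times in seconds.
starts = [0, 120, 240, 2000]
durations = [30, 30, 30, 30]

print(count_warmups(starts, durations, ttl_seconds=0))    # no TTL
print(count_warmups(starts, durations, ttl_seconds=300))  # 5-minute TTL
```

In this model, copies that start close together share one warm-up, while a copy that starts long after the previous one finished pays it again.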

 

 

 

I hope this clarifies things!

 

 

 

Liliam

 

UK Engineer

 
