
Azure Synapse Analytics CI/CD with Custom Parameters - Made Easy!



Guest jehayes

Introduction

 

 

Custom parameter templates for Azure Synapse Analytics Workspace deployments allow you to parameterize Synapse Pipeline features and settings that are not exposed in the default deployment parameters template. In this blog post, I will demonstrate how to:

 

  • Create Synapse Pipelines that simplify the creation of a custom parameters template
  • Create the custom parameters template
  • Deploy the custom parameters template as part of an Azure DevOps Release Pipeline for Synapse Workspace deployments

 


 

If you are new to Azure Synapse Analytics Workspace deployments, I suggest you first gain an understanding of CI/CD with Synapse and the Azure DevOps Release Pipeline task for Azure Synapse Workspace deployments. The resources listed in the Summary at the end of this post are a good place to start.

 

 

Overview

 

 

When integrating Synapse Analytics Workspaces with Azure DevOps Git, you have the advantages of both Source Control and CI/CD for deployments to other Synapse Environments. When you publish a Git-Enabled Synapse Workspace, two ARM Template files are created in your Git repo workspace_publish branch: TemplateForWorkspace.json and TemplateParametersForWorkspace.json:

 

 

 


 

 

 

By default, only a limited set of values is exposed as parameters in the TemplateParametersForWorkspace.json file, such as linked service connection strings and trigger parameter values. Furthermore, Global Parameters are not available in Synapse Pipelines as they are in Azure Data Factory.

 

 

 

When deploying Synapse Workspaces to other environments (such as uat, prod, etc.), you may need to configure different values in your Synapse pipelines for each environment. For example, let’s say I have a Synapse pipeline that does not fit a metadata-driven pattern; however, I still need a pipeline activity to have a different value in each environment. Or perhaps I have a fully baked metadata-driven pipeline, but I want it to perform a Lookup activity on my metadata table based upon the Synapse Workspace environment.

 

 

 

To override the default parameter template, you can create a custom parameter template. It must be named template-parameters-definition.json and placed in the root folder of the collaboration branch in your Git repo:

 


 

 

 

Creating custom parameter templates is well documented (see the resources at the end of this post). However, it can be cumbersome to parse through the entire workspace template to figure out how to construct the parameter definitions for each feature you want to parameterize. I found an easier approach: create pipeline parameters for any setting that you want to parameterize and that supports dynamic expressions (and what doesn’t in Synapse?). Then create a simple JSON file that exposes just the pipeline parameter values. This also provides more transparency about which settings are parameterized, much like environment variables in SSIS.

 

 

 

In this blog post, we’ll cover:

 

  1. Creating and leveraging pipeline parameters in the Synapse Workspace

    1. Decide what features need to be parameterized by environment
    2. Create a pipeline parameter for each feature
    3. Use the pipeline parameter in the pipeline activity feature’s dynamic expression

  2. Creating the template-parameters-definition.json file to expose the necessary Synapse pipeline parameters
  3. Overriding the parameter values in Synapse Workspace Deployment task for the DevOps release pipeline

Creating and leveraging pipeline parameters in the Synapse Workspace

 

 

I have 2 examples in my Synapse Analytics Workspace:

 

  1. A pipeline which calls a Spark Notebook where the storage account and the size of the Spark pool’s executors and driver vary by environment
  2. A metadata-driven pipeline where some activity setting values vary by environment

 

For the first pipeline, I added 4 Pipeline Parameters:

 

 

 


 

Python uses a different storage endpoint than Synapse linked services so I need to construct the endpoint in my Notebook. The storage account will be different based upon the environment, so I need to parameterize it:
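As a rough sketch of what such a notebook cell might look like, assuming the notebook reads from ADLS Gen2 through the abfss:// scheme: the helper name build_abfss_path, the container name raw, the folder sales/2023, and the account name mydevstorage are all illustrative placeholders rather than values from the workspace described here.

```python
# Parameter cell: in a Synapse notebook this value would be overridden by the
# pipeline's Notebook activity. "mydevstorage" is a placeholder default.
storage_account = "mydevstorage"


def build_abfss_path(account, container, relative_path=""):
    """Build a URI of the form abfss://<container>@<account>.dfs.core.windows.net/<path>."""
    base = f"abfss://{container}@{account}.dfs.core.windows.net"
    cleaned = relative_path.strip("/")
    return f"{base}/{cleaned}" if cleaned else base


# "raw" and "sales/2023" are hypothetical container and folder names.
source_path = build_abfss_path(storage_account, "raw", "sales/2023")
```

Because only the account name changes between environments, it is the only part of the path that needs to come from a pipeline parameter.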

 

 

 


 

On the Notebook Activity, I set the notebook parameter values to the pipeline parameter values. Though I have a medium Spark pool defined in both my dev and uat Synapse workspaces, I want to save money by running only a small executor and driver size in dev. In uat, we’ll want to use medium.

 

 

 


 

 

 

My second pipeline is a full metadata-driven pipeline which leverages a control table in SQL for Copy Data and Dataflow Activities. I also want to have a single table for some static values that don’t vary by pipeline entities. I want this table to hold values for all environments rather than separate tables or databases for each environment.

 

 

 


 

 

 

In this pipeline, I have a single parameter called Environment:

 

 

 


 

 

 

The pipeline has a Lookup activity that queries my table to get the values for that environment:

 

 

 


 

 

 

Here’s the full query dynamic expression:

 

 

 


 

 

 

select * from dbo.ParameterLookup where PipelineName = '@{pipeline().Pipeline}' and Environment = '@{pipeline().parameters.Environment}'
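To make the interpolation concrete, here is a small Python simulation of how the two @{...} placeholders in the expression above get replaced with values at run time. It is only an illustration, not how Synapse evaluates expressions internally; the function render_query and the pipeline name PL_Metadata_Load are hypothetical.

```python
import re

QUERY_TEMPLATE = (
    "select * from dbo.ParameterLookup "
    "where PipelineName = '@{pipeline().Pipeline}' "
    "and Environment = '@{pipeline().parameters.Environment}'"
)


def render_query(template, pipeline_name, parameters):
    """Mimic string interpolation for the two expression forms used in the query."""

    def resolve(match):
        expression = match.group(1)
        if expression == "pipeline().Pipeline":
            return pipeline_name
        prefix = "pipeline().parameters."
        if expression.startswith(prefix):
            return str(parameters[expression[len(prefix):]])
        raise ValueError(f"Unsupported expression: {expression}")

    # Replace every @{...} placeholder with its resolved value.
    return re.sub(r"@\{([^}]*)\}", resolve, template)


# "PL_Metadata_Load" is a made-up pipeline name for the example.
uat_query = render_query(QUERY_TEMPLATE, "PL_Metadata_Load", {"Environment": "uat"})
```

With Environment set to uat, the Lookup would effectively run the query filtered to the uat rows for that pipeline.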

 

 

 

The results from the Lookup are then used to set a variable for the folder name:

 

 

 


 

The variable, along with the other Lookup activity outputs, is used in the rest of my pipeline:

 


 

 

 

Next for the fun (and easy) part.

 

 

 

Creating the template-parameters-definition.json file

 

 

After committing all the changes to the collaboration branch (typically main), go to that branch in the DevOps Git repo and create a new file in the root directory called template-parameters-definition.json.

 

 

 


 

 

 

To expose ALL pipeline parameters to DevOps, the file would simply contain:

 

 

 

{
    "Microsoft.Synapse/workspaces/pipelines": {
        "properties": {
            "parameters": {
                "*": {
                    "*": "="
                }
            }
        }
    }
}

 

 

 

 

However, in my case, I only want to parameterize the values of my Environment, StorageAccount, and SparkExecutorSize parameters, so my JSON file contains only those parameters:

 

 

 

{
    "Microsoft.Synapse/workspaces/pipelines": {
        "properties": {
            "parameters": {
                "Environment": {
                    "defaultValue": "="
                },
                "StorageAccount": {
                    "defaultValue": "="
                },
                "SparkExecutorSize": {
                    "defaultValue": "="
                }
            }
        }
    }
}
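If the list of parameters grows, the file can also be generated rather than typed by hand. The following is a minimal sketch, assuming you keep the list of parameter names somewhere in code; the function build_parameter_template is hypothetical and simply reproduces the structure shown above.

```python
import json


def build_parameter_template(parameter_names):
    """Return the custom parameter definition for the given pipeline parameter names."""
    return {
        "Microsoft.Synapse/workspaces/pipelines": {
            "properties": {
                "parameters": {
                    # "=" keeps the current value as the parameter's default.
                    name: {"defaultValue": "="} for name in parameter_names
                }
            }
        }
    }


template = build_parameter_template(["Environment", "StorageAccount", "SparkExecutorSize"])
template_json = json.dumps(template, indent=4)
```

Writing template_json to template-parameters-definition.json in the root of the collaboration branch would give the same file as the hand-written version.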

 

 

 

 

The next time I publish my Synapse workspace, the pipeline parameters are added to the TemplateParametersForWorkspace.json file:

 

 

 


 

 

 

Overriding the parameter values in Synapse Workspace Deployment task

 

 

In the DevOps release pipeline for my UAT environment, I created variables for the 3 pipeline parameters with the values for my UAT environment:

 

 

 


 

 

 

In the Synapse Deployment release pipeline task, I override the default values of my Synapse pipeline parameters with the variables:
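For longer lists of overrides, the override text can be assembled programmatically. This sketch assumes the task's override field takes space-separated "-name value" pairs and that release variables are referenced with the $(VariableName) syntax. The ARM parameter names below are hypothetical; copy the real names from your published TemplateParametersForWorkspace.json.

```python
def build_override_string(overrides):
    """Join ARM parameter overrides into space-separated '-name value' pairs."""
    return " ".join(f"-{name} {value}" for name, value in overrides.items())


# Hypothetical ARM parameter names; the real ones come from the published
# TemplateParametersForWorkspace.json. $(...) refers to a release variable.
overrides = {
    "PL_Metadata_Load_properties_parameters_Environment_defaultValue": "$(Environment)",
    "PL_Spark_Notebook_properties_parameters_StorageAccount_defaultValue": "$(StorageAccount)",
}
override_string = build_override_string(overrides)
```

Keeping the values as release variables rather than literals means the same override text can be reused across stages, with only the variable values changing per environment.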

 


 

 

 

After I save my release pipeline, the next time my UAT environment is deployed, whether manually or through CI/CD triggers, my Synapse UAT Workspace will have the new values!

 


 

 

 

That’s it! Easy-peasy!

 

 

 

Summary

 

 

When Synapse pipeline activities have settings that vary by environment, adding pipeline parameters for those settings simplifies the custom parameter template. The template only needs to expose the pipeline parameters, and their values can then be overridden in the Azure Synapse Workspace deployment task of your Azure DevOps Release Pipeline.

 

 

 

If you are new to Synapse Git integration and DevOps, I also recommend these resources:

 

CI/CD in Azure Synapse Analytics

 

 

Automating the Publishing of Workspace Artifacts in Synapse CICD - Microsoft Community Hub

 

 

 

To create a custom parameters template for features beyond pipeline parameters, check out:

 

CICD Automation in Synapse Analytics: taking advantage of custom parameters in Workspace Templates - Microsoft Community Hub

 

 

 

I hope you enjoyed this article and welcome any feedback!

 
