Guest RuiCunha Posted September 7, 2022

Introduction

Azure Synapse Studio is the primary tool for interacting with the many components of Azure Synapse Analytics, allowing you to perform a wide range of activities against your data and build a fully integrated analytics solution. Integrating Synapse Studio with a source control system such as Azure DevOps Git or GitHub has proven to be one of Studio’s preferred features for collaborative work and source control.

Working collaboratively and tracking code changes in an integrated analytics platform that combines data warehousing, big data analytics, data integration, and visualization can be quite challenging: multidisciplinary teams work on different projects and features, within complex application lifecycles that require agility and… automation!

The figure below illustrates the Synapse CICD lifecycle for workspace artifacts. You can see that some manual steps prevent a fully automated process.

Figure 1: Current CICD flow in Synapse

The goal of this article is to take a deep dive into the features recently introduced in V2 (preview) of the Synapse Workspace Deployment task. They automate the publishing step of the process, allowing you to deploy the code from any user branch without any manual intervention from the UI.

Synapse Workspace Artifact Deployment: the Past

Let’s take a closer look at the CICD flow in Synapse before the V2 release of the Synapse Workspace Deployment task, using this simple scenario as an example: two developers, John and Mary, are working on different projects and features, developing their code in a single Synapse workspace.

Figure 2: Source control and publishing in Synapse Workspace

John has been developing a new feature for Project X, which must be deployed and tested in the UAT environment tomorrow.
Mary has also been developing a new feature, but for Project Y, which is scheduled for deployment and acceptance only next week. For the past few weeks, both developers have been developing their code in their own feature branches, publishing their changes and executing their code to make sure everything works before deploying their features to the UAT environment.

The day for deploying John’s feature has come. John is about to trigger the DevOps release pipeline to deploy his code to UAT, but first he wants to review the ARM templates that are going to be deployed. He realizes that Mary’s artifacts are already part of the ARM templates, even though they are not supposed to be deployed until next week. John really wanted to cherry-pick his feature, but since V1 of the Workspace Deployment task requires the ARM templates generated from the collaboration branch, Mary’s artifacts will also be included in the deployment. It’s an all-or-nothing approach.

So, when using the Workspace Deployment task V1:

- You need to manually hit the Publish button in Synapse Studio to generate the ARM templates.
- Developers working on different projects or features cannot cherry-pick their own code to be deployed to a target workspace.
- The Synapse Workspace Deployment task needs the ARM templates generated in the workspace publish branch to kick off the Continuous Delivery process.

Synapse Workspace Artifact Deployment: the Present and… the Future!

The Workspace Deployment task V2 (in preview at the time of writing) introduces new deployment modes (also known as operation types): “Validate” (only available in YAML pipelines) and “Validate and Deploy”. These operations facilitate CICD automation and introduce a new CICD flow.

Figure 3: The new CICD workflow in Synapse

I will explain in more detail the rationale behind each operation type and some of the applicable use cases.

Validate Operation

The Validate operation only works in YAML pipelines.
The goal of the Validate operation is to validate the files in a non-publish branch and export the workspace ARM templates as pipeline artifacts. This is useful when you want to automate the validation and generation of ARM templates from any user branch. Before V2, this was only possible from the collaboration branch, and you had to do it manually from Synapse Studio.

Validate and Deploy Operation

The goal of the Validate and Deploy operation is to deploy the artifacts to a target workspace from any user branch (a non-publish branch). It does the job of the Validate operation (generating the ARM templates based on the specified branch) and adds an extra step that deploys only the artifacts from that branch. This is useful when you want to cherry-pick the code that you deploy from your lower environment to your target environment, bypassing the manual Publish operation. Before V2, the deploy operation would only consider the ARM code published from the collaboration branch. In V2, the task can consider the code from any non-publish branch!

Deploy Operation

This operation remains unchanged from V1: the goal of the Deploy operation is to take the ARM templates manually generated using the Publish action in Synapse Studio (from the collaboration branch) and deploy the artifacts to a target environment. With this operation you cannot deploy the code of an individual branch separately; you deploy only the code published from your collaboration branch.
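
The Validate and Deploy operation described above can be sketched as a step in a YAML pipeline. This is a minimal sketch rather than a definitive definition: the input names (`operation`, `ArtifactsFolder`, `TargetWorkspaceName`, `azureSubscription`, `ResourceGroupName`) and the `validateDeploy` value are written as I recall them from the task's documentation and should be checked against the current task reference, and every value in angle brackets is a hypothetical placeholder.

```yaml
# Sketch only: values in <angle brackets> are hypothetical placeholders.
trigger:
  branches:
    include:
      - '<user-branch>'          # the non-publish branch to deploy from

pool:
  vmImage: ubuntu-latest

steps:
  # Check out the branch that triggered the run.
  - checkout: self

  # Assumed input names; verify against the task reference before use.
  - task: Synapse workspace deployment@2
    inputs:
      operation: 'validateDeploy'                           # 'validate' for the Validate operation
      ArtifactsFolder: '$(System.DefaultWorkingDirectory)'  # assumes artifacts sit at the branch root
      TargetWorkspaceName: '<target-workspace-name>'
      azureSubscription: '<service-connection-name>'
      ResourceGroupName: '<target-resource-group>'
```

With `operation` set to `validate`, the step would only validate the branch and generate the ARM templates, so the deployment-related inputs (the service connection and resource group) should not be needed; treat that as an assumption as well.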
Automating the publish action from any user branch: demonstration

Below are a couple of use cases where you can benefit from the “Validate and Deploy” feature to automate the publish action of Synapse workspace artifacts from any user branch.

Use case #1: Master branch and publish branch constraints

When you need to orchestrate the code that you have recently developed in a feature branch, the typical flow is to create a pull request to merge your code into the master (collaboration) branch and then, from this branch, manually hit the Publish button to persist your code in Live Mode. This also generates the ARM template files in the publish branch.

This can be a showstopper in scenarios where you are not allowed to push your changes directly to the master (collaboration) branch, or where you simply don’t want those changes to be propagated to the ARM templates in the publish branch. This is no longer a problem with the new “Validate and Deploy” feature, as it allows you to publish your code directly to Live Mode, bypassing both the merge into the master branch and the ARM template generation in the publish branch.

Here’s an example of how you can make it work:

Create a new release pipeline and add an artifact pointing to your feature branch.

Set up a continuous deployment trigger. This allows you to choose whether the release is triggered whenever a push occurs in your user branch or when you create a pull request.

Add a new stage to your release pipeline, then add and configure the Synapse Workspace Deployment task.

Configure the Synapse Workspace Deployment task as follows:

1. From the Task version dropdown, select “2 (preview)”.
2. Select the “Validate and deploy” operation type.
3.
Select the root folder of the user branch whose code you want to publish automatically.

In this example, I’m using my dev workspace as the target for this deployment, as I want to automate the publishing of my code to the dev environment. Once you finish configuring the task, just save your changes; don’t create the release manually.

Going back to the feature branch, let’s make a minor change in a notebook and commit the changes. Then let’s take a quick walk through the dev workspace:

Selecting the feature branch:

Selecting the main branch: as expected, no code from the feature branch has been merged with the main branch.

Switching to Live Mode: you can see that the code from your feature branch has been published.

Use case #2: Cherry-picking the code to be deployed to a target environment

This is a very common use case, arising from the need to isolate the testing and acceptance of code that is being developed in the source environment. If we look at the Deploy flow (figure above), the deployment operation uses the ARM templates from the publish branch, which contain all features published in our development environment. If we deploy these ARM templates to the target environment, we might deliver unnecessary features to it. We can, however, take advantage of the new “Validate and Deploy” feature to cherry-pick and publish only the code that we want to test in the target environment.

Here’s an example. Consider Mary’s “featureA”, which is ready to be tested in the UAT environment. Several Project X features have already been published in Live Mode by her teammate John, but John’s code is not supposed to go to UAT yet, only Mary’s code (Project Y). If Mary follows the typical CICD flow, merging her code into the collaboration branch and publishing her changes, the generated ARM templates in the publish branch will contain both John’s and Mary’s code.
These ARM templates cannot be used in this use case, as they would deploy John’s code to the UAT environment as well. So, we need to take advantage of the new “Validate and Deploy” feature to achieve Mary’s goal of deploying only her code.

In this example, as part of the branching strategy, we will use a “gateway” branch called “UAT-ready”. Code from any feature that is ready to be deployed to the UAT environment is merged into this branch. We will use this “gateway” branch as a target branch filter to trigger our release pipeline; more details on this follow below.

Here’s an example of how we can do that:

Create a new release pipeline and add an artifact pointing to your “gateway” branch.

Set up a continuous deployment trigger. The branching strategy in place allows Mary to merge her code from her feature branch into the “UAT-ready” gateway branch. You can configure the continuous deployment trigger as follows:

Add a new stage to your release pipeline, then add and configure the Synapse Workspace Deployment task.

Configure the Synapse Workspace Deployment task as follows:

1. From the Task version dropdown, select “2 (preview)”.
2. Select the “Validate and deploy” operation type.
3. Select the root folder of the “UAT-ready” branch. This is where you are merging your code through a pull request.

In this second use case, I’m using my UAT workspace as the target for this deployment. Save your changes and do not create any release.

Now trigger the release pipeline by creating a pull request that merges your code from your feature branch into the “UAT-ready” branch.

IMPORTANT: When creating this pull request, don’t forget to set “UAT-ready” as the target branch.

After creating your pull request, you will see your release pipeline kick off. Complete your pull request to save your changes.
Once the release is completed, you can go to your UAT environment to confirm that your code changes have been successfully published. And, as expected, there is no sign of any code related to Project Y being published in the DEV environment.

With this last step, we conclude the second use case.

Conclusion

Automating the publish action has been one of the most challenging tasks in Synapse CICD. By introducing the new operation types (“Validate” and “Validate and Deploy”) in the Workspace Deployment task V2, we are bringing enhancements that facilitate this automation. In this article, I have demonstrated how to take advantage of these new features using two simple scenarios. Combining creativity with engineering will bring new automation capabilities to your CICD use cases.