Get started with your first HDInsight on AKS cluster

  • Thread starter: Abhishjain

HDInsight on AKS allows you to deploy popular open-source analytics workloads such as Apache Spark, Apache Flink, and Trino.

Let's see how to create a cluster in HDInsight on AKS in just a few clicks and a few minutes!


Before that, let's make sure the registrations the service requires are in place:



  1. Tenant registration: This is a one-time operation required to give the HDInsight on AKS first-party (1P) app access to your tenant.
    • Run the single-line command given in this documentation and you are set.
    • Creating a cluster without this step may result in an error like the one below:
      "--The identity of the calling application could not be established. please run command to provision the 1p service principle on the new tenant to onboard."
  2. Feature registration: This is required if your subscription has never used Azure Kubernetes Service (AKS) and certain of its features, such as "AKS-AzureKeyVaultSecretsProvider", "Pod Identity", and "Kubelet Disk".
    • Follow the steps mentioned here to register for these features.
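
As a rough illustration of the feature registration step, the sketch below only builds and prints Azure CLI command lines; it does not run them. It assumes the features live under the `Microsoft.ContainerService` namespace and that "Pod Identity" and "Kubelet Disk" correspond to the flags `EnablePodIdentityPreview` and `KubeletDisk`. Those two flag names are assumptions and do not come from this post, so check them against the linked documentation before use.

```python
import shlex

# Assumption: AKS features are registered under this resource provider namespace.
NAMESPACE = "Microsoft.ContainerService"

# Feature names as written in the post, mapped to CLI feature-flag names.
# Only "AKS-AzureKeyVaultSecretsProvider" appears verbatim in the post;
# the other two flag names are assumptions to verify against the docs.
FEATURES = {
    "AKS-AzureKeyVaultSecretsProvider": "AKS-AzureKeyVaultSecretsProvider",
    "Pod Identity": "EnablePodIdentityPreview",
    "Kubelet Disk": "KubeletDisk",
}


def registration_commands(features=FEATURES, namespace=NAMESPACE):
    """Return shell-ready command lines: one per feature, then a provider refresh."""
    commands = [
        ["az", "feature", "register", "--namespace", namespace, "--name", flag]
        for flag in features.values()
    ]
    # Re-register the provider so the newly registered features take effect.
    commands.append(["az", "provider", "register", "--namespace", namespace])
    return [shlex.join(argv) for argv in commands]


if __name__ == "__main__":
    for line in registration_commands():
        print(line)
```

Printing the commands instead of executing them keeps the sketch safe to run anywhere; you can review the output and paste it into a shell where the Azure CLI is signed in to the right subscription.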

You are all set to spin up your first HDInsight on AKS cluster.



Let's make it easy for you: here are the ARM templates or Azure portal links to get going!

We will show you how to create a Spark cluster using Azure portal.

Keep the following resources handy.

  • User-assigned managed identity (MSI).
  • ADLS Gen2 storage account and a container.
  • You can opt to create these resources using ready-to-use ARM templates.



  1. During creation, you are required to select the cluster pool.

    HDInsight on AKS introduces the concept of a cluster pool: a logical way to organize all your clusters of different workloads under a single umbrella. This means you can have Apache Spark, Apache Flink, and Trino clusters under a single cluster pool, which makes the clusters easy to manage and helps you realize a lakehouse architecture. The first time, you need to create a cluster pool; after that, you can use the same pool to create other clusters!

  2. Provide the user-assigned managed identity, storage account, and SKU.
  3. Create the cluster.

That's it! Your first cluster is up and running in just a few minutes.
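
To picture how the creation steps above fit together, here is a small conceptual model of a cluster pool that holds clusters of different workloads, each carrying the inputs the portal asks for (managed identity, storage account, container, SKU). It is only a sketch for reasoning about the concept: the class names, field names, and sample values are hypothetical and are not part of any Azure SDK or API.

```python
from dataclasses import dataclass, field

# Workload types named in the post.
WORKLOADS = {"Spark", "Flink", "Trino"}


@dataclass
class Cluster:
    """Hypothetical record of the inputs supplied when creating a cluster."""
    name: str
    workload: str
    managed_identity: str  # user-assigned managed identity (MSI)
    storage_account: str   # ADLS Gen2 storage account
    container: str         # container inside the storage account
    sku: str               # node size chosen in the portal


@dataclass
class ClusterPool:
    """Hypothetical logical grouping of clusters, as described in the post."""
    name: str
    clusters: list = field(default_factory=list)

    def add_cluster(self, cluster):
        # Reject workloads the post does not mention.
        if cluster.workload not in WORKLOADS:
            raise ValueError(f"unsupported workload: {cluster.workload}")
        # Every prerequisite from the post must be filled in.
        required = ("managed_identity", "storage_account", "container", "sku")
        missing = [name for name in required if not getattr(cluster, name)]
        if missing:
            raise ValueError(f"missing inputs: {', '.join(missing)}")
        self.clusters.append(cluster)
        return cluster

    def workloads(self):
        """Sorted list of distinct workload types in this pool."""
        return sorted({c.workload for c in self.clusters})


if __name__ == "__main__":
    # All names and values below are placeholders.
    pool = ClusterPool("analytics-pool")
    pool.add_cluster(Cluster("spark-01", "Spark", "my-msi", "mystorage", "data", "example-sku"))
    pool.add_cluster(Cluster("trino-01", "Trino", "my-msi", "mystorage", "data", "example-sku"))
    print(pool.workloads())
```

The pool is created once and then reused, which mirrors the point that later clusters are added to the same pool rather than needing a new one each time.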


Let us know how your cluster creation experience went for your favorite analytics workloads.



We are super excited to get you started:


  • Join our community, share an idea or share your success story - Sign Up | LinkedIn
  • Have a question on how to migrate or want to discuss a use case - Microsoft Forms
