
Create a data solution on Azure Synapse Analytics with Snapshot Serengeti - Part 2 (Analytics)



Author(s):

 

Josh Ndemenge is a Cloud Developer Advocate for Data at Microsoft, and David Abu is a Cloud Advocate for Power BI at Microsoft.

 

 

 

This is the second blog in a four-part series on building an end-to-end data analytics and machine learning solution on Azure Synapse Analytics. If you haven't already, be sure to check out the first blog at Create a Data Solution on Azure Synapse Analytics with Snapshot Serengeti - Part 1 before proceeding.

 

 

 

In the first blog, we covered how to create the Synapse workspace and use notebooks to load data into Azure Data Lake Storage Gen2 and the SQL data warehouse. In this blog, we will explore how to integrate with Power BI; integration with the Azure Machine Learning service is covered in the upcoming articles.

 


 

 

 

To get started, let’s connect the SQL data warehouse to Power BI and create a few reports!

 

 

 

To connect to Azure Synapse Analytics using Power BI Desktop, first open the application and click on the “Get Data” button. Then, select “More” to see a wider range of data source options. In the search bar, type in “Synapse” to filter the options and select “Azure Synapse Analytics (SQL)” from the list.

 

 

 


 

 

 

Next, you will be prompted to enter your server name. Type in the name of your server and then select the “DirectQuery” option.
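If you are unsure what to type as the server name, the sketch below builds it from a workspace name. It assumes the usual Synapse endpoint pattern (`<workspace>.sql.azuresynapse.net` for the dedicated SQL pool and `<workspace>-ondemand.sql.azuresynapse.net` for the serverless pool). The function name and the workspace name `serengeti-ws` are hypothetical; the exact endpoint for your workspace is listed on its Overview page in the Azure portal.

```python
def synapse_sql_endpoint(workspace: str, serverless: bool = False) -> str:
    """Return the SQL endpoint for a Synapse workspace (assumed naming pattern)."""
    name = workspace.strip().lower()
    if not name:
        raise ValueError("workspace name must not be empty")
    # Serverless pools use the "-ondemand" suffix; dedicated pools do not.
    suffix = "-ondemand" if serverless else ""
    return f"{name}{suffix}.sql.azuresynapse.net"


# "serengeti-ws" is a made-up workspace name used only for illustration.
print(synapse_sql_endpoint("serengeti-ws"))
print(synapse_sql_endpoint("serengeti-ws", serverless=True))
```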

 

*Note that DirectQuery is a connection mode that lets you query data directly from the data source at query time, without importing it into Power BI.

 

 

 

In Import mode, on the other hand, data is first loaded into Power BI’s internal data model before it can be queried and visualized. Check out DirectQuery in Power BI to learn more.
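The difference between the two modes can be illustrated with a small Python sketch. This is only an analogy, not how Power BI is implemented: an in-memory SQLite table stands in for the Synapse SQL pool, and the table contents and function names are hypothetical.

```python
import sqlite3

# Stand-in "data source": an in-memory SQLite table with made-up rows.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, location TEXT)")
source.executemany("INSERT INTO images (location) VALUES (?)", [("B04",), ("C11",)])


def count_direct(conn: sqlite3.Connection) -> int:
    """DirectQuery-style: every call sends a query to the source."""
    return conn.execute("SELECT COUNT(*) FROM images").fetchone()[0]


# Import-style: copy the rows once and answer from that local snapshot.
imported_rows = source.execute("SELECT id, location FROM images").fetchall()


def count_imported() -> int:
    """Import-style: the answer reflects the data as of the last load."""
    return len(imported_rows)


# A new row arrives at the source after the import.
source.execute("INSERT INTO images (location) VALUES (?)", ("D09",))

print(count_direct(source))  # includes the new row
print(count_imported())      # unchanged until the snapshot is refreshed
```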

 

 

 

Click OK, then open the Power Query Editor to see the data.

 

 

 


 

 

 

Click on the Annotations table, then open the dropdown next to Category Id and uncheck 0 and 1. This removes the empty and human categories from the dataset.

 

 

 

Repeat this for the categories table, then click Close and Apply to return to the main Power BI window.
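For readers who think in code, the filter applied in Power Query Editor amounts to the following Python sketch. The sample rows and species names are hypothetical, and the mapping of 0 to "empty" and 1 to "human" is an assumption based on the order in which the article names them.

```python
# Category ids the article filters out (assumed: 0 = empty, 1 = human).
EXCLUDED_CATEGORY_IDS = {0, 1}

# Hypothetical sample rows; the real tables come from the SQL data warehouse.
annotations = [
    {"image_id": "img-1", "category_id": 0},
    {"image_id": "img-2", "category_id": 1},
    {"image_id": "img-3", "category_id": 5},
    {"image_id": "img-4", "category_id": 12},
]
categories = [
    {"id": 0, "name": "empty"},
    {"id": 1, "name": "human"},
    {"id": 5, "name": "zebra"},
    {"id": 12, "name": "lion"},
]


def drop_excluded(rows: list, key: str) -> list:
    """Keep only rows whose category id is not in the excluded set."""
    return [row for row in rows if row[key] not in EXCLUDED_CATEGORY_IDS]


filtered_annotations = drop_excluded(annotations, "category_id")
filtered_categories = drop_excluded(categories, "id")

print(filtered_annotations)
print(filtered_categories)
```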

 

 

Modeling data in Power BI

 

 

Our objective is to link the different tables within the model view to create a data model similar to the one below.

 

 

 

[Image: model view showing the linked tables]

 

 

 

To model the data, follow these steps:

 

  1. Click Categories[id] and drag to connect it to annotations[category_id].
  2. Click Categories[name] and drag to connect it to train[category_name] and Val[category_name].
  3. Click images[id] and drag to connect it to annotations[image_id], then in Properties set the cross-filter direction to Both.

 


 

 

 

  4. Click images[id] and drag to connect it to train[image_id] and Val[image_id].
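The relationships created in the steps above can be summarized as a small data structure. The Python sketch below is only a way to read the model at a glance; the `Relationship` class and the `related_tables` helper are hypothetical and not part of any Power BI API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    from_table: str
    from_column: str
    to_table: str
    to_column: str
    cross_filter: str = "single"  # "both" only where the steps say so


# One entry per link drawn in the model view.
RELATIONSHIPS = [
    Relationship("categories", "id", "annotations", "category_id"),
    Relationship("categories", "name", "train", "category_name"),
    Relationship("categories", "name", "val", "category_name"),
    Relationship("images", "id", "annotations", "image_id", cross_filter="both"),
    Relationship("images", "id", "train", "image_id"),
    Relationship("images", "id", "val", "image_id"),
]


def related_tables(table: str) -> list:
    """Return the tables directly linked to `table`, sorted by name."""
    linked = set()
    for rel in RELATIONSHIPS:
        if rel.from_table == table:
            linked.add(rel.to_table)
        elif rel.to_table == table:
            linked.add(rel.from_table)
    return sorted(linked)


print(related_tables("images"))
print(related_tables("annotations"))
```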

 

 

 

Now that we have completed modeling the data, we can start analyzing it. Click the Report view icon at the top left to go back to the blank canvas.

 

 

 

*Note: As of March 2023, the Power BI interface has changed, and you might notice differences during the exercise. Please update Power BI Desktop to the latest version.

 

 

 

DAX measures

 

 

We will create a simple report and use some DAX measures to count the rows in the annotation, images, train and Val tables. To achieve this, we’ll leverage the new Quick Measures AI functionality within Power BI.

 

  1. Click on Quick Measure at the top.
  2. Click on Suggestions
  3. Type “count how many rows in the images table” and click Generate.

 

 

 


 

 

 

  4. Click Add.
  5. In the formula bar at the top, change the measure name from “measure” to “Number of images”.
  6. Create the DAX measures for the other tables using the quick measure AI tool:
    • Annotation
    • Train
    • Val
  7. Rename each of these measures appropriately.
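Each of these measures is a plain row count; in DAX that is typically written with `COUNTROWS`, although the exact expression the suggestion tool generates may differ. The Python sketch below shows what the four measures compute, using hypothetical in-memory rows in place of the model tables. The measure names follow the ones used in the report later in this article.

```python
# Hypothetical stand-ins for the four model tables.
tables = {
    "images": [{"id": "img-1"}, {"id": "img-2"}, {"id": "img-3"}],
    "annotations": [{"image_id": "img-1"}, {"image_id": "img-2"}],
    "train": [{"image_id": "img-1"}],
    "val": [{"image_id": "img-2"}],
}

# Display name for each table's row-count measure.
MEASURE_NAMES = {
    "images": "Number of images",
    "annotations": "Number of Annotation",
    "train": "Number of Trained",
    "val": "Number of Validation",
}


def row_count_measures(model: dict) -> dict:
    """Return {measure name: row count} for every table in the model."""
    return {MEASURE_NAMES[name]: len(rows) for name, rows in model.items()}


for measure, value in row_count_measures(tables).items():
    print(f"{measure}: {value}")
```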

 

Creating charts

 

 

Next, we’ll create visualizations to explore the variations in animal images from Snapshot Serengeti across different seasons, locations and species.

 

 

 

To learn more about on-object visual, check out Use on-object interaction with visuals in your report (preview).

 

 

 

1. Select the card visual, then add the measure called Images to it.

 


 

 

 

  • To access the editing mode, press the + icon next to the card visual.
  • Select more options from the menu.
  • You can now edit and interact with the visual.

 

 

 

2. Add 3 card visuals to display the measures created above:

 

  • Number of Trained
  • Number of Validation
  • Number of Annotation

 

 

 

3. Add 2 slicer visuals:

 

  • The first one for categories[name]; rename it to Animals.
  • The second one for Annotation[season]; rename it to Season.

 

4. To show the Annotation count by Animals, use a clustered bar chart.

 

  • Select the clustered bar chart option.
  • On the right data pane, choose Categories[name] and Number of Annotation

 


 

 

 

5. To show the Annotation count by Season, use a clustered bar chart.

 

  • Select the clustered bar chart option.
  • On the right data pane, choose Annotation[season] and Number of Annotation

 

 

 

6. To show the images count by location, use a clustered bar chart.

 

  • Select the clustered bar chart option.
  • On the right data pane, choose images[location] and DAX measure: Images.

 

 

 

7. To show the images count by season, use a clustered bar chart.

 

  • Select the clustered bar chart option.
  • On the right data pane, choose Annotation[season] and DAX measure: Images.

 

 

 

8. To compare the Train and Val tables, use a Line and Clustered Column chart.

 

  • Select the Line and Clustered Column chart option.
  • X-axis: name
  • Column y-axis: Number of Trained
  • Line y-axis: Number of Validation
  • Filter it to Top 5 by using the filter pane. See the picture below.

 

[Image: Top 5 filter applied in the filter pane]
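The bar charts and the Top 5 filter correspond to a group-by count followed by a top-N selection. Here is a minimal Python sketch of that logic, assuming hypothetical (animal, season) annotation rows; the species, season labels and counts are made up for illustration.

```python
from collections import Counter

# Hypothetical annotation rows as (animal, season) pairs.
rows = (
    [("wildebeest", "S1")] * 6
    + [("zebra", "S1")] * 5
    + [("gazelle", "S2")] * 4
    + [("buffalo", "S2")] * 3
    + [("lion", "S1")] * 2
    + [("hyena", "S2")] * 1
)


def count_by(data: list, index: int) -> Counter:
    """Count rows per distinct value in the given column position."""
    return Counter(row[index] for row in data)


def top_n(counts: Counter, n: int = 5) -> list:
    """Return the n most frequent values with their counts."""
    return counts.most_common(n)


by_animal = count_by(rows, 0)   # annotation count by animal
by_season = count_by(rows, 1)   # annotation count by season

print(top_n(by_animal))
print(dict(by_season))
```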

 

 

 

 

 

Finally, this results in:

 

[Image: the completed Power BI report]

 

 

 

Now that we have created the Power BI reports, publish them to the Power BI service.

 

 

 

Power BI linked services in Azure Synapse Analytics

 

 

Navigate back to the Synapse workspace and click on the “Linked services” option under the “Manage” section.

 

 

 


 

 

 

 

 

Now you can access your Power BI reports directly in Azure Synapse Analytics. Check out Quickstart: Linking a Power BI workspace to a Synapse workspace to learn more.

 

 

Conclusion

 

 

In this article, we've covered how to link Power BI to Azure Synapse Analytics to create a data pipeline, as well as how to create a Power BI report and publish it to the Power BI service. In the upcoming articles, we'll explore how to link the Azure Machine Learning service with Azure Synapse Analytics and train your ML models.

 

 

Resources

 

 

For a more in-depth understanding of the services discussed in this article, take a look at this handy collection of resources:

 

 
