Jump to content

Relationships between ADX tables on two databases or two clusters


Recommended Posts

Guest DanyHoter
Posted

Cross-database and cross-cluster relationships in PBI over ADX tables

 

 

 

Summary

 

 

You may have more than one ADX database and probably more than one ADX cluster.

 

In some cases, you want to join tables or functions from more than one database/cluster.

 

In this article you’ll see how to make sure that such joins are folded and sent to the ADX backend instead of executing at the level of the Power Query mashup engine.

 

Everything mentioned here is applicable to Azure Data Explorer, Synapse Data Explorer, and Fabric RTA.

 

 

 

Cross-Database

 

 

We’ll use the nyc_taxi table in the samples database in the help cluster.

 

| started by creating a query to this table in Power BI desktop,

 

I added a small dimension table to the ContosoSales database called rate_codes.

 

It can be used to add a name to the rate_code in the nyc_taxi table.

 

I created a query for this table in the same PBI model.

 

It is important to use the IsDimension=true setting in the dimension query.

 

Next step is creating a relationship between the two tables based on the rate_code.

 

I created the relationship as 1:m with assume referential integrity.

 

There is a small number of rides with rate codes that do not appear in the dimension table.

 

The number of such rides is very small, and there is a performance advantage in using inner joins.

 

In the attached example crossdatabase.pbix the table takes more than 50% longer to refresh with leftouter join.

 

Out of the box database() notation

 

 

If you look at the queries generated by PBI using .show queries, you can see that the dimension table which is coming from a different database is mentioned as database('ContosoSales').["Rate_Codes"]

 

This is an automatic behavior by the connector and it allows the join to be constructed in a normal way.

 

 

 

Cross-cluster relationships

 

 

A more complicated case is when the two tables in a relationship are in two different clusters.

 

By default, PBI will not attempt to join the two tables in ADX although it is perfectly possible.

 

Any time that the cluster parameter is not the same in two queries, they are considered as two different sources and no joins will be generated between them.

 

In the attached file crosscluster.pbix, there are two queries with a relationship.

 

When the transactions are filtered by countries, PBI will create a list with all customer keys from the selected countries and create a where clause “where CustomerKey in(11,22,33…)”

 

The list of customers can be very long and affect performance.

 

You can see this behavior in the attached file crosscluster.pbix.

 

First, you need to copy the customers table to your own cluster using:

 

.set Customers <| cluster(“help”).database(‘ContosoSales’).Customers

 

When opening the pbix, it will not work until you change the values of two parameters:

 

 

 

mediumvv2px400.png.f46e94db9708151393425f1cbca881c9.png

 

 

 

 

 

After you click OK PBI will connect to each of the two clusters and will not use a join between them.

 

 

 

How to trigger a cross cluster join from PBI

 

 

 

 

One way would be to write the query with lookup in Kusto web explorer and export to PBI.

 

SalesFact

 

| lookup kind=inner

 

cluster('Your Cluster').database( 'Your Database').Customers

 

on CustomerKey

 

 

 

 

 

 

 

In the UI it will be

 

mediumvv2px400.png.398e2baebb84c48c85d6a869ba50e7cc.png

 

 

 

Both the fact and the dimension are in the context of the same cluster, but the third parameter includes a fully qualified table including a cluster and a database.

 

Another and simpler way used in the attached file crossfilter1.pbix is to create a query that uses the same cluster and database as fir the fact table, and in the table name dialog use the notation with cluster and database.

 

Again, I used parameters that you can edit and enter your cluster and database.

 

The M query will look like this in the advanced editor let

 

Source = AzureDataExplorer.Contents("Azure Data Explorer","ContosoSales","cluster('" & DimCluster & "').database('" & DimDatabase & "').Customers", [isDimension=true])

 

in

 

Source

 

Continue reading...

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...