
Lesson Learned #250: All started with the phrase: In PowerBI Direct Query is slow-ColumnStore Index



Guest Jose_Manuel_Jurado

In some situations, customers using PowerBI with Direct Query report performance issues, depending on how PowerBI defines the query. In this scenario, I would like to share how we fixed one such performance issue using a ColumnStore Index and Partitioning.

 

 

 

We are going to use two technologies that help a lot with this kind of performance issue: ColumnStore Index and Partitioning. The ColumnStore Index is the standard for storing and querying large data warehousing fact tables. It uses column-based data storage and query processing to achieve up to 10 times the query performance of traditional row-oriented storage in your data warehouse, and up to 10 times the data compression over the uncompressed data size. Partitioning helps when our queries filter by date.

 

 

 

For this example, I downloaded the demo database from "Release Wide World Importers sample database v1.0 · microsoft/sql-server-samples · GitHub" and duplicated the rows of the table Fact.Sale until it had 234,767,360 rows. I chose the Hyperscale database tier, basically as a medium-size database for OLAP.
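The duplication step itself is not shown in this article. A minimal sketch of one way to do it, assuming the table is grown by re-inserting its own rows in a loop and that [Sale Key] is generated automatically (so it is left out of the column list). The variable @Doublings and its value are hypothetical and are not the values behind the row count quoted above:

```sql
-- Sketch only: every pass re-inserts all existing rows, doubling Fact.Sale.
-- @Doublings is a hypothetical setting; adjust it to reach the size you need.
DECLARE @Doublings int = 3;
DECLARE @i int = 0;

WHILE @i < @Doublings
BEGIN
    INSERT INTO [Fact].[Sale]
        ([City Key], [Customer Key], [Bill To Customer Key], [Stock Item Key],
         [Invoice Date Key], [Delivery Date Key], [Salesperson Key], [WWI Invoice ID],
         [Description], [Package], [Quantity], [Unit Price], [Tax Rate],
         [Total Excluding Tax], [Tax Amount], [Profit], [Total Including Tax],
         [Total Dry Items], [Total Chiller Items], [Lineage Key])
    SELECT
         [City Key], [Customer Key], [Bill To Customer Key], [Stock Item Key],
         [Invoice Date Key], [Delivery Date Key], [Salesperson Key], [WWI Invoice ID],
         [Description], [Package], [Quantity], [Unit Price], [Tax Rate],
         [Total Excluding Tax], [Tax Amount], [Profit], [Total Including Tax],
         [Total Dry Items], [Total Chiller Items], [Lineage Key]
    FROM [Fact].[Sale];

    SET @i += 1;
END;
```

Each pass runs as a single statement, so the later passes move a lot of data in one transaction; on a small test database it may be worth keeping the number of passes low.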

 

 

 

In every performance analysis with PowerBI, if you need to know how many rows each table has, use the following TSQL instead of SELECT COUNT(*). It reads the row counts from metadata, so it does not have to scan the tables.

 

 

 

 

 

SELECT

t.NAME AS TableName,

s.Name AS SchemaName,

max(CASE i.type WHEN 5 THEN si.rowcnt ELSE p.rows END) AS RowCounts,

SUM(a.total_pages) * 8 AS TotalSpaceKB,

SUM(a.used_pages) * 8 AS UsedSpaceKB,

(SUM(a.total_pages) - SUM(a.used_pages)) * 8 AS UnusedSpaceKB

FROM

sys.tables t

INNER JOIN

sys.indexes i ON t.OBJECT_ID = i.object_id

INNER JOIN

sysindexes si ON t.OBJECT_ID = si.id

INNER JOIN

sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id

INNER JOIN

sys.allocation_units a ON p.partition_id = a.container_id

LEFT OUTER JOIN

sys.schemas s ON t.schema_id = s.schema_id

WHERE t.is_ms_shipped = 0

AND i.OBJECT_ID > 255

GROUP BY

t.Name, s.Name

ORDER BY

t.Name, s.Name
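As an alternative sketch, not taken from the original article, the row counts can also be read from the sys.dm_db_partition_stats dynamic management view. It sums the rows of every partition, which matters for partitioned tables such as the one created later, and it counts only the heap or clustered index so that each row is counted once:

```sql
-- Row counts per table from partition metadata (no table scan).
SELECT
    s.name AS SchemaName,
    t.name AS TableName,
    SUM(ps.row_count) AS RowCounts
FROM sys.dm_db_partition_stats AS ps
INNER JOIN sys.tables AS t ON t.object_id = ps.object_id
INNER JOIN sys.schemas AS s ON s.schema_id = t.schema_id
WHERE t.is_ms_shipped = 0
  AND ps.index_id IN (0, 1)   -- 0 = heap, 1 = clustered index (rowstore or columnstore)
GROUP BY s.name, t.name
ORDER BY s.name, t.name;
```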

 

 

 

Define the report.

 

 

In this situation, we have a report that needs to show, per Fiscal Month Label, City and Stock Item, the total sales including and excluding tax, filtered by FY2013-Aug, FY2013-Feb and FY2013-Jan.

 

 

 

[Image: report definition in PowerBI]

 

PowerBI generates the following TSQL statement, which we captured using Azure Data Studio and the SQL Server Profiler extension.

 

 

 

 

 

 

SELECT

TOP (1000001) *

FROM

(

 

SELECT [t1].[City] AS [c8],[t3].[Fiscal Month Label] AS [c43],[t5].[stock Item] AS [c59],SUM([t7].[Total Including Tax])

AS [a0],SUM([t7].[Total Excluding Tax])

AS [a1]

FROM

(

(

((

select [$Table].[sale Key] as [sale Key],

[$Table].[City Key] as [City Key],

[$Table].[Customer Key] as [Customer Key],

[$Table].[bill To Customer Key] as [bill To Customer Key],

[$Table].[stock Item Key] as [stock Item Key],

[$Table].[invoice Date Key] as [invoice Date Key],

[$Table].[Delivery Date Key] as [Delivery Date Key],

[$Table].[salesperson Key] as [salesperson Key],

[$Table].[WWI Invoice ID] as [WWI Invoice ID],

[$Table].[Description] as [Description],

[$Table].[Package] as [Package],

[$Table].[Quantity] as [Quantity],

[$Table].[unit Price] as [unit Price],

[$Table].[Tax Rate] as [Tax Rate],

[$Table].[Total Excluding Tax] as [Total Excluding Tax],

[$Table].[Tax Amount] as [Tax Amount],

[$Table].[Profit] as [Profit],

[$Table].[Total Including Tax] as [Total Including Tax],

[$Table].[Total Dry Items] as [Total Dry Items],

[$Table].[Total Chiller Items] as [Total Chiller Items],

[$Table].[Lineage Key] as [Lineage Key]

from [Fact].[saleColumnStoreIndex] as [$Table]

) AS [t7]

 

INNER JOIN

 

(

select [$Table].[City Key] as [City Key],

[$Table].[WWI City ID] as [WWI City ID],

[$Table].[City] as [City],

[$Table].[state Province] as [state Province],

[$Table].[Country] as [Country],

[$Table].[Continent] as [Continent],

[$Table].[sales Territory] as [sales Territory],

[$Table].[Region] as [Region],

[$Table].[subregion] as [subregion],

convert(nvarchar(max), [$Table].[Location]) as [Location],

[$Table].[Latest Recorded Population] as [Latest Recorded Population],

[$Table].[Valid From] as [Valid From],

[$Table].[Valid To] as [Valid To],

[$Table].[Lineage Key] as [Lineage Key]

from [Dimension].[City] as [$Table]

) AS [t1] on

(

[t7].[City Key] = [t1].[City Key]

)

)

 

 

INNER JOIN

 

(

select [$Table].[Date] as [Date],

[$Table].[Day Number] as [Day Number],

[$Table].[Day] as [Day],

[$Table].[Month] as [Month],

[$Table].[short Month] as [short Month],

[$Table].[Calendar Month Number] as [Calendar Month Number],

[$Table].[Calendar Month Label] as [Calendar Month Label],

[$Table].[Calendar Year] as [Calendar Year],

[$Table].[Calendar Year Label] as [Calendar Year Label],

[$Table].[Fiscal Month Number] as [Fiscal Month Number],

[$Table].[Fiscal Month Label] as [Fiscal Month Label],

[$Table].[Fiscal Year] as [Fiscal Year],

[$Table].[Fiscal Year Label] as [Fiscal Year Label],

[$Table].[iSO Week Number] as [iSO Week Number]

from [Dimension].[Date] as [$Table]

) AS [t3] on

(

[t7].[Delivery Date Key] = [t3].[Date]

)

)

 

 

INNER JOIN

 

(

select [$Table].[stock Item Key] as [stock Item Key],

[$Table].[WWI Stock Item ID] as [WWI Stock Item ID],

[$Table].[stock Item] as [stock Item],

[$Table].[Color] as [Color],

[$Table].[selling Package] as [selling Package],

[$Table].[buying Package] as [buying Package],

[$Table].[brand] as [brand],

[$Table].[Size] as [Size],

[$Table].[Lead Time Days] as [Lead Time Days],

[$Table].[Quantity Per Outer] as [Quantity Per Outer],

[$Table].[is Chiller Stock] as [is Chiller Stock],

[$Table].[barcode] as [barcode],

[$Table].[Tax Rate] as [Tax Rate],

[$Table].[unit Price] as [unit Price],

[$Table].[Recommended Retail Price] as [Recommended Retail Price],

[$Table].[Typical Weight Per Unit] as [Typical Weight Per Unit],

[$Table].[Photo] as [Photo],

[$Table].[Valid From] as [Valid From],

[$Table].[Valid To] as [Valid To],

[$Table].[Lineage Key] as [Lineage Key]

from [Dimension].[stock Item] as [$Table]

) AS [t5] on

(

[t7].[stock Item Key] = [t5].[stock Item Key]

)

)

 

WHERE

(

([t3].[Fiscal Month Label] IN (N'FY2013-Aug',N'FY2013-Feb',N'FY2013-Jan'))

)

 

GROUP BY [t1].[City],[t3].[Fiscal Month Label],[t5].[stock Item]

)

AS [MainTable]

WHERE

(

 

NOT(

(

[a0] IS NULL

)

)

OR

NOT(

(

[a1] IS NULL

)

)

 

)
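The generated statement is long because PowerBI wraps every table in a subquery that lists all of its columns. Stripped down by hand, the work the engine has to do looks roughly like the sketch below. This is my own simplification, not something PowerBI emits: it drops the TOP (1000001) limit and the outer filter on NULL totals, and the aliases f, c, d and si are arbitrary. The SET STATISTICS lines are only there to make it easy to compare timings and reads when the same query is pointed at [Fact].[Sale] instead.

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Hand-simplified shape of the report query: three dimension joins,
-- a filter on the fiscal month label, and two aggregated measures.
SELECT
    c.[City],
    d.[Fiscal Month Label],
    si.[Stock Item],
    SUM(f.[Total Including Tax]) AS [Total Including Tax],
    SUM(f.[Total Excluding Tax]) AS [Total Excluding Tax]
FROM [Fact].[SaleColumnStoreIndex] AS f
INNER JOIN [Dimension].[City] AS c
    ON f.[City Key] = c.[City Key]
INNER JOIN [Dimension].[Date] AS d
    ON f.[Delivery Date Key] = d.[Date]
INNER JOIN [Dimension].[Stock Item] AS si
    ON f.[Stock Item Key] = si.[Stock Item Key]
WHERE d.[Fiscal Month Label] IN (N'FY2013-Aug', N'FY2013-Feb', N'FY2013-Jan')
GROUP BY c.[City], d.[Fiscal Month Label], si.[Stock Item];

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Only a handful of columns are actually read from the fact table, which is the access pattern that column-based storage is designed for.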

 

 

 

Redefine the table to use a columnstore index and partitioning.

I created a new table called Fact.SaleColumnStoreIndex that uses a clustered columnstore index and is partitioned by [Delivery Date Key]. Once I had this table, I inserted all the rows from the table Fact.Sale so I could compare the results and see the impact.

 

 

 

 

 

-- The partition function and scheme must exist before the table that is created on them.
CREATE PARTITION FUNCTION myDateRangePF (date)
AS RANGE RIGHT FOR VALUES ('2013-01-01','2014-01-01','2015-01-01','2016-01-01')
GO

CREATE PARTITION SCHEME myPartitionScheme
AS PARTITION myDateRangePF ALL TO ([PRIMARY])
GO

-- Fact table stored as a clustered columnstore index, partitioned by delivery date.
CREATE TABLE [Fact].[saleColumnStoreIndex](
[sale Key] [bigint] IDENTITY(1,1) NOT NULL,
[City Key] [int] NOT NULL,
[Customer Key] [int] NOT NULL,
[bill To Customer Key] [int] NOT NULL,
[stock Item Key] [int] NOT NULL,
[invoice Date Key] [date] NOT NULL,
[Delivery Date Key] [date] NULL,
[salesperson Key] [int] NOT NULL,
[WWI Invoice ID] [int] NOT NULL,
[Description] [nvarchar](100) NOT NULL,
[Package] [nvarchar](50) NOT NULL,
[Quantity] [int] NOT NULL,
[unit Price] [decimal](18, 2) NOT NULL,
[Tax Rate] [decimal](18, 3) NOT NULL,
[Total Excluding Tax] [decimal](18, 2) NOT NULL,
[Tax Amount] [decimal](18, 2) NOT NULL,
[Profit] [decimal](18, 2) NOT NULL,
[Total Including Tax] [decimal](18, 2) NOT NULL,
[Total Dry Items] [int] NOT NULL,
[Total Chiller Items] [int] NOT NULL,
[Lineage Key] [int] NOT NULL,
index [saleColumnStoreIndex_CC] CLUSTERED COLUMNSTORE
) ON myPartitionScheme([Delivery Date Key])
GO

 

INSERT INTO [Fact].[saleColumnStoreIndex]

([City Key]

,[Customer Key]

,[bill To Customer Key]

,[stock Item Key]

,[invoice Date Key]

,[Delivery Date Key]

,[salesperson Key]

,[WWI Invoice ID]

,[Description]

,[Package]

,[Quantity]

,[unit Price]

,[Tax Rate]

,[Total Excluding Tax]

,[Tax Amount]

,[Profit]

,[Total Including Tax]

,[Total Dry Items]

,[Total Chiller Items]

,[Lineage Key])

SELECT

[City Key]

,[Customer Key]

,[bill To Customer Key]

,[stock Item Key]

,[invoice Date Key]

,[Delivery Date Key]

,[salesperson Key]

,[WWI Invoice ID]

,[Description]

,[Package]

,[Quantity]

,[unit Price]

,[Tax Rate]

,[Total Excluding Tax]

,[Tax Amount]

,[Profit]

,[Total Including Tax]

,[Total Dry Items]

,[Total Chiller Items]

,[Lineage Key]

FROM [FACT].[sale]

GO

 

CREATE INDEX SaleColumnStoreIndex_City

ON Fact.SaleColumnStoreIndex

(

[City Key]

)

 

CREATE INDEX SaleColumnStoreIndex_Date

ON Fact.SaleColumnStoreIndex

(

[Delivery Date Key]

)

 

CREATE INDEX SaleColumnStoreIndex_Stock_Item_key

ON Fact.SaleColumnStoreIndex

(

[stock Item Key]

)

 

CREATE INDEX SaleColumnStoreIndex_Customer_key

ON Fact.SaleColumnStoreIndex

(

[Customer Key]

)

 

CREATE INDEX SaleColumnStoreIndex_SalesPerson_Key

ON Fact.SaleColumnStoreIndex

(

[salesPerson Key]

)
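To check that the rows landed in the partitions we expect, a short sketch using the $PARTITION function and sys.partitions. The sample date 2013-08-15 is an arbitrary value chosen for illustration. With the RANGE RIGHT boundaries defined in the script, partition 1 holds dates before 2013-01-01 and partition 2 holds dates from 2013-01-01 up to, but not including, 2014-01-01.

```sql
-- Map a sample date to its partition number.
-- 2013-08-15 falls in partition 2 (2013-01-01 <= date < 2014-01-01).
SELECT $PARTITION.myDateRangePF(CAST('2013-08-15' AS date)) AS PartitionNumber;

-- Rows per partition of the clustered columnstore index (index_id = 1).
SELECT p.partition_number, p.rows
FROM sys.partitions AS p
WHERE p.object_id = OBJECT_ID(N'Fact.SaleColumnStoreIndex')
  AND p.index_id = 1
ORDER BY p.partition_number;
```

Rows with a NULL [Delivery Date Key] are placed in the first partition, so a large count there is not necessarily old data.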

 

 

 

 

 

Besides improving performance and reducing the time the report takes, we could also see how the total size of the table was reduced:

 

 

 

[Image: table size comparison]
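One way to reproduce this kind of size comparison yourself is sketched below. It is not necessarily how the figures in the image were obtained. It assumes both tables exist under the names used in this article, and it uses sp_spaceused for the overall size plus the columnstore row group DMV to see how the compressed data is spread across partitions.

```sql
-- Overall reserved, data and index size of each table.
EXEC sp_spaceused N'Fact.Sale';
EXEC sp_spaceused N'Fact.SaleColumnStoreIndex';

-- Row groups of the columnstore table, summarized per partition and state.
SELECT
    rg.partition_number,
    rg.state_desc,
    COUNT(*) AS RowGroups,
    SUM(rg.total_rows) AS TotalRows,
    SUM(rg.size_in_bytes) / 1024 / 1024 AS SizeMB
FROM sys.dm_db_column_store_row_group_physical_stats AS rg
WHERE rg.object_id = OBJECT_ID(N'Fact.SaleColumnStoreIndex')
GROUP BY rg.partition_number, rg.state_desc
ORDER BY rg.partition_number, rg.state_desc;
```

Row groups in the COMPRESSED state are the ones that benefit from columnstore compression; rows still in OPEN or CLOSED delta row groups are stored row-wise until they are compressed.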

 

 

 

Enjoy!

 
