Jump to content

Lesson Learned #266: Avoiding duplicate statistics


Recommended Posts

Guest Jose_Manuel_Jurado
Posted

Today, I worked on a service request that our customer reported that the statistics update maintenance plan is taking too much time. This is causing that all these operations (rebuild and update statistics) are taking more time that the maintenace windows that they have. I would like to share with you my findings how to reduce this time.

 

 

 

I found we're updating too many statistics and we worked in two main operations:

 

 

 

  • Avoiding to update statistics of are related to indexes:
    • I saw that our customer is rebuilding the indexes and after it updating all statistics. If you already rebuilt the indexes remember that the statistics associated with this index will be updated too. So, there is not needed to invest resources in terms of memory, CPU and IO to perform this operation.

    [*]Duplicate Statistics:

    • I saw a high number of statistics in their table, around 250 statistics and also, they have a high number of indexes in the table. I asked if they created these indexes based on performance needs or they executed previously a performance test with their application to find out these indexes. They mentioned that they are creating indexes depending on performance basis.
    • In this case, I suggested to review if they have already an user/automatic statistics created before creating any index, Because they could have duplicate statistics one user/automatic statistics and another one index related. For example,

 

 

 

We have the following table:

 

 

 

CREATE TABLE MyData (ID INT IDENTITY(1,1) PRIMARY KEY, [Name] VARCHAR(80))

INSERT INTO myData ([name]) valueS(replicate('Y',70));

GO 1000

INSERT INTO myData ([name]) valueS(replicate('M',70));

GO 1000

INSERT INTO myData ([name]) valueS(replicate('C',70));

GO 1000

INSERT INTO myData ([name]) valueS(replicate('A',70));

GO 1000

 

 

 

 

 

if our application runs any query using the column name, for example, SELECT * FROM MyData where [name]='YMCA', we are going to see that an automatic statistics has been created for this column called _WA_Sys_00000002_2116E6DF

 

 

 

528x404vv2.png.cae48e31e8cb51ef82f11eacb5c8bca2.png

 

 

 

542x516vv2.png.8628f84b76efea0c05020fec11ab8935.png

 

 

 

Futher, if based on performance needs you need to create an index in this table based on this column, you are going to have the following situation:

 

 

 

CREATE INDEX MyData_ix1

ON MyData

(

Name

)

 

 

 

2 statistics: One related with user/automatic statistic and other one related with index.

 

 

 

mediumvv2px400.png.dd09692dc78c7d918744c11e490bc11a.png

 

 

 

So, dropping the unnecessary automatic statistics, our customer reduces time for updating the statistics. In this case, my lesson learned here, is check if you have an existing user/automatic statistics to avoid this scenario and check your script to avoid this situation.

 

 

 

Enjoy!

 

Continue reading...

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...