Note that there are some explanatory texts on larger screens.

plurals
  1. POT-SQL bad performance with CTE
    primarykey
    data
    text
    <p>I have a performance question about Common table expressions in SQL Server. In our developer team we use a lot of chaining CTEs when building our queries. I am currently working on a query which had terrible performance. But I found out that if I in the middle of the chain inserted all the records up to that CTE in a temporary table instead and then continued but selecting from that temp table I improved performance significantly. Now I would like to get some help to understand if this type of change only applies to this specific query and why the two cases you will see below differ so much in performance. Or could we possibly overuse CTEs in our team and can we gain performance generally by learning from this case?</p> <p>Please try to explain to me exactly what is happening here...</p> <p>The code is complete and you will be able to run it on SQL Server 2008 and probably 2005 too. One part is commented out and my idea is that you can switch the two cases by comment out one or the other. You can see where to put your block comments, I have marked these places with <code>--block comment here</code> and <code>--end block comment here</code></p> <p>It is the slow performing case that is uncommented default. Here you are:</p> <pre><code>--Declare tables to use in example. CREATE TABLE #Preparation ( Date DATETIME NOT NULL ,Hour INT NOT NULL ,Sales NUMERIC(9,2) ,Items INT ); CREATE TABLE #Calendar ( Date DATETIME NOT NULL ) CREATE TABLE #OpenHours ( Day INT NOT NULL, OpenFrom TIME NOT NULL, OpenTo TIME NOT NULL ); --Fill tables with sample data. INSERT INTO #OpenHours (Day, OpenFrom, OpenTo) VALUES (1, '10:00', '20:00'), (2, '10:00', '20:00'), (3, '10:00', '20:00'), (4, '10:00', '20:00'), (5, '10:00', '20:00'), (6, '10:00', '20:00'), (7, '10:00', '20:00') DECLARE @CounterDay INT = 0, @CounterHour INT = 0, @Sales NUMERIC(9, 2), @Items INT; WHILE @CounterDay &lt; 365 BEGIN SET @CounterHour = 0; WHILE @CounterHour &lt; 5 BEGIN SET @Items = CAST(RAND() * 100 AS INT); SET @Sales = CAST(RAND() * 1000 AS NUMERIC(9, 2)); IF @Items % 2 = 0 BEGIN SET @Items = NULL; SET @Sales = NULL; END INSERT INTO #Preparation (Date, Hour, Items, Sales) VALUES (DATEADD(DAY, @CounterDay, '2011-01-01'), @CounterHour + 13, @Items, @Sales); SET @CounterHour += 1; END INSERT INTO #Calendar (Date) VALUES (DATEADD(DAY, @CounterDay, '2011-01-01')); SET @CounterDay += 1; END --Here the query starts. ;WITH P AS ( SELECT DATEADD(HOUR, Hour, Date) AS Hour ,Sales ,Items FROM #Preparation ), O AS ( SELECT DISTINCT DATEADD(HOUR, SV.number, C.Date) AS Hour FROM #OpenHours AS O JOIN #Calendar AS C ON O.Day = DATEPART(WEEKDAY, C.Date) JOIN master.dbo.spt_values AS SV ON SV.number BETWEEN DATEPART(HOUR, O.OpenFrom) AND DATEPART(HOUR, O.OpenTo) ), S AS ( SELECT O.Hour, P.Sales, P.Items FROM O LEFT JOIN P ON P.Hour = O.Hour ) --block comment here case 1 (slow performing) --With this technique it takes about 34 seconds. ,N AS ( SELECT A.Hour ,A.Sales AS SalesOrg ,CASE WHEN COALESCE(B.Sales, C.Sales, 1) &lt; 0 THEN 0 ELSE COALESCE(B.Sales, C.Sales, 1) END AS Sales ,A.Items AS ItemsOrg ,COALESCE(B.Items, C.Items, 1) AS Items FROM S AS A OUTER APPLY (SELECT TOP 1 * FROM S WHERE Hour &lt;= A.Hour AND Sales IS NOT NULL AND DATEDIFF(DAY, Hour, A.Hour) = 0 ORDER BY Hour DESC) B OUTER APPLY (SELECT TOP 1 * FROM S WHERE Sales IS NOT NULL AND DATEDIFF(DAY, Hour, A.Hour) = 0 ORDER BY Hour) C ) --end block comment here case 1 (slow performing) /*--block comment here case 2 (fast performing) --With this technique it takes about 2 seconds. SELECT * INTO #tmpS FROM S; WITH N AS ( SELECT A.Hour ,A.Sales AS SalesOrg ,CASE WHEN COALESCE(B.Sales, C.Sales, 1) &lt; 0 THEN 0 ELSE COALESCE(B.Sales, C.Sales, 1) END AS Sales ,A.Items AS ItemsOrg ,COALESCE(B.Items, C.Items, 1) AS Items FROM #tmpS AS A OUTER APPLY (SELECT TOP 1 * FROM #tmpS WHERE Hour &lt;= A.Hour AND Sales IS NOT NULL AND DATEDIFF(DAY, Hour, A.Hour) = 0 ORDER BY Hour DESC) B OUTER APPLY (SELECT TOP 1 * FROM #tmpS WHERE Sales IS NOT NULL AND DATEDIFF(DAY, Hour, A.Hour) = 0 ORDER BY Hour) C ) --end block comment here case 2 (fast performing)*/ SELECT * FROM N ORDER BY Hour IF OBJECT_ID('tempdb..#tmpS') IS NOT NULL DROP TABLE #tmpS; DROP TABLE #Preparation; DROP TABLE #Calendar; DROP TABLE #OpenHours; </code></pre> <p>If you would like to try and understand what I am doing in the last step I have a SO question about it <a href="https://stackoverflow.com/questions/10654508/t-sql-query-update-null-values">here</a>.</p> <p>For me case 1 takes about 34 seconds and case 2 takes about 2 seconds. The difference is that I store the result from S in a temp table in case 2, in case 1 I use S in my next CTE directly.</p>
    singulars
    1. This table or related slice is empty.
    1. This table or related slice is empty.
    plurals
    1. This table or related slice is empty.
    1. This table or related slice is empty.
 

Querying!

 
Guidance

SQuiL has stopped working due to an internal error.

If you are curious you may find further information in the browser console, which is accessible through the devtools (F12).

Reload