Posts Tagged While Loop

Recursive CTE vs While Loop – Performance Analysis

When it comes to Recursive CTEs in SQL Server there are often misconceptions about their performance, especially compared to While Loops. In this post we’ll explore two iterative scenarios and explore the performance of both a Recursive CTE solution and a While Loop solution. Its important to understand that both a Recursive CTE and While Loop both follow a linear pattern. That means that the duration and use of resource will grow linearly with increases in the number of iterations/recursions/rows required to process the query. The linear growth of both is where a lot of the misconceptions occur. While Loops are clear iterative processes and its easy to see that the growth would be linear, but a Recursive CTE is not immediately obvious as being linear because of the way they are written with an anchor and a recursive element. The thing to remember is that a Recursive CTE is the equivalent of writing multiple selects with union all statements in between. So each recursion is essentially a new union all and select statement.By thinking about Recursive CTEs as multiple union all statements it is easier to understand the linear growth.

Now onto the performance test results. There are two scenarios for this performance test. The first scenario is a data generation scenario and the second scenario is a row concatenation scenario. The data generation scenario will illustrate performance for an extreme number of recursions/iterations as only one row of data will be generated for each iteration. The row concatenation scenario will illustrate a more average scenario where the number of recursions/iterations is low, maxing out at 30 iterations, but across a wide range of rows of data.

Looking at the Data Generation scenario we can easily see that the Recursive CTE and While Loop have equal durations as the variance is under 10.00% and even at 10 million iterations/recursions/rows the Recursive CTE is only 3 seconds faster than the While Loop. Additionally we can see that the Recursive CTE is making better use of CPU than the While Loop but the While Loop makes better use of Reads. This is typical of Recursive CTEs, they frequently out perform other query types in CPU usage, but the reads are always higher. Its important to note though that the reads are all from buffer cache, there is no actual disk IO occurring. The charts depicting the ratios of each metric per row shows conclusively the linear growth of both query types have equal performance.

Finally in the Row Concatenation scenario that more accurately mimics a real world scenario for both query types we see the Recursive CTE really shinning. Now only is it substantially more efficient with its CPU usage, but we see almost 200% better efficiency on duration. Like before though there are substantially more reads with the Recursive CTE, but again these are from buffer cache and no actual disk IO is occurring. The data also suggests that additional rows of data would see the Recursive CTE actually out perform the While Loop in terms of Reads as well.

Hopefully this performance analysis will help to dispel the misconceptions about Recursive CTEs and their performance.

Since there is a lot of results for the performance tests I’ve attached the full performance results that includes additional charting and metrics as well as the full break out of the data. Each test was run three times and then an average taken. Additional below is the full code samples used for testing.

Recursive CTE vs While Loop – Data Generator – Performance Analysis

Recursive CTE vs While Loop – Row Concatenator Performance Analysis


 

Data Generation Scenario:

Generates N rows of parent child key combinations using N iterations/recursions.

Recursive CTE

SET NOCOUNT ON;

DECLARE

@Iteration INT = 0

, @MaxIterations INT = 10000000

, @Partitioner INT = 1

, @ParentKey INT = 0

, @ChildKey INT = 0;

CREATE TABLE #GeneratedData

(

[ParentKey] INT

, [ChildKey] INT

);

————————————————————————————————————–

WITH

[GeneratedData] AS

(

SELECT

@Iteration AS [Iteration]

, @Partitioner AS [Partitioner]

, CASE

WHEN @Iteration % @Partitioner = 0

THEN @Iteration

ELSE @ParentKey

END AS [ParentKey]

, CASE

WHEN @Iteration % @Partitioner = 0

THEN @Partitioner 1

ELSE @Iteration % @Partitioner

END AS [ChildKey]

UNION ALL

SELECT

GD.[Iteration] + 1 AS [Iteration]

, CASE

WHEN ((GD.[Iteration] + 1) % GD.[Partitioner]) = 0

THEN ((GD.[Partitioner] + 1) % 30) + 1

ELSE GD.[Partitioner]

END AS [Partitioner]

, CASE

WHEN ((GD.[Iteration] + 1) % GD.[Partitioner]) = 0

THEN GD.[Iteration] + 1

ELSE GD.[ParentKey]

END AS [ParentKey]

, ((GD.[Iteration] + 1) % GD.[Partitioner]) AS [ChildKey]

FROM

[GeneratedData] AS GD

WHERE

GD.[Iteration] < @MaxIterations

)

————————————————————————————————————–

INSERT INTO

#GeneratedData

([ParentKey], [ChildKey])

SELECT

GD.[ParentKey] AS [ParentKey]

, GD.[ChildKey] AS [ChildKey]

FROM

[GeneratedData] AS GD

OPTION

(MAXRECURSION 0);

————————————————————————————————————–

DROP TABLE #GeneratedData;

 


 

 

While Loop

SET NOCOUNT ON;

DECLARE

@Iteration INT = 0

, @MaxIterations INT = 10000000

, @Partitioner INT = 1

, @ParentKey INT = 0

, @ChildKey INT = 0;

CREATE TABLE #GeneratedData

(

[ParentKey] INT

, [ChildKey] INT

);

————————————————————————————————————–

WHILE

@Iteration < @MaxIterations

BEGIN

————————————————————————————————————–

INSERT INTO

#GeneratedData

([ParentKey], [ChildKey])

SELECT

CASE

WHEN @Iteration % @Partitioner = 0

THEN @Iteration

ELSE @ParentKey

END AS [ParentKey]

, CASE

WHEN @Iteration % @Partitioner = 0

THEN @Partitioner 1

ELSE @Iteration % @Partitioner

END AS [ChildKey];

————————————————————————————————————–

SELECT

@Iteration = @Iteration + 1

, @Partitioner =

CASE

WHEN (@Iteration % @Partitioner) = 0

THEN ((@Partitioner + 1) % 30) + 1

ELSE @Partitioner

END

, @ParentKey =

CASE

WHEN (@Iteration % @Partitioner) = 0

THEN @Iteration + 1

ELSE @ParentKey

END

, @ChildKey = (@Iteration % @Partitioner);

END;

————————————————————————————————————–

DROP TABLE #GeneratedData;


 

Performance Results

Recursive CTE vs While Loop - Data Generator - Performance Analysis - Duration

Recursive CTE vs While Loop - Data Generator - Performance Analysis - CPU

Recursive CTE vs While Loop - Data Generator - Performance Analysis - Reads

Recursive CTE vs While Loop - Data Generator - Performance Analysis - Charts

 


 

 

Row Concatenation Scenario:

Concatenates all child keys for each parent key. This uses the data generation query from before to populate the source data and as such the maximum number of child keys is 30 but the actual number varies for each parent key to mimic real world scenarios.

Recursive CTE

SET NOCOUNT ON;

DECLARE

@Iteration INT = 0

, @MaxIterations INT = 10000000

, @Partitioner INT = 1

, @ParentKey INT = 0

, @ChildKey INT = 0;

CREATE TABLE #GeneratedData

(

[ParentKey] INT NOT NULL

, [ChildKey] INT NOT NULL

);

————————————————————————————————————–

WITH

[GeneratedData] AS

(

SELECT

@Iteration AS [Iteration]

, @Partitioner AS [Partitioner]

, CASE

WHEN @Iteration % @Partitioner = 0

THEN @Iteration

ELSE @ParentKey

END AS [ParentKey]

, CASE

WHEN @Iteration % @Partitioner = 0

THEN @Partitioner 1

ELSE @Iteration % @Partitioner

END AS [ChildKey]

UNION ALL

SELECT

GD.[Iteration] + 1 AS [Iteration]

, CASE

WHEN ((GD.[Iteration] + 1) GD.[Partitioner]) = 0

THEN ((GD.[Partitioner] + 1) % 30) + 1

ELSE GD.[Partitioner]

END AS [Partitioner]

, CASE

WHEN ((GD.[Iteration] + 1) % GD.[Partitioner]) = 0

THEN GD.[Iteration] + 1

ELSE GD.[ParentKey]

END AS [ParentKey]

, ((GD.[Iteration] + 1) % GD.[Partitioner]) AS [ChildKey]

FROM

[GeneratedData] AS GD

WHERE

GD.[Iteration] < @MaxIterations

)

————————————————————————————————————–

INSERT INTO

#GeneratedData

([ParentKey], [ChildKey])

SELECT

GD.[ParentKey] AS [ParentKey]

, GD.[ChildKey] AS [ChildKey]

FROM

[GeneratedData] AS GD

OPTION

(MAXRECURSION 0);

————————————————————————————————————–

ALTER TABLE #GeneratedData

ADD PRIMARY KEY CLUSTERED

(

[ParentKey]

, [ChildKey]

);

GO

————————————————————————————————————–

CREATE TABLE #SortedData

(

[ParentKey] INT

, [ChildKey] VARCHAR(4000)

, [Order] INT

, [ReverseOrder] INT

PRIMARY KEY

(

[Order]

, [ParentKey]

)

);

CREATE TABLE #ConcatenatedData

(

[ParentKey] INT

, [ChildKeys] VARCHAR(4000)

);

————————————————————————————————————–

INSERT INTO

#SortedData

([ParentKey], [ChildKey], [Order], [ReverseOrder])

SELECT DISTINCT

GD.[ParentKey] AS [ParentKey]

, GD.[ChildKey] AS [ChildKey]

, ROW_NUMBER() OVER (PARTITION BY GD.[ParentKey] ORDER BY GD.[ParentKey], GD.[ChildKey]) AS [Order]

, ROW_NUMBER() OVER (PARTITION BY GD.[ParentKey] ORDER BY GD.[ParentKey], GD.[ChildKey] DESC) AS [ReverseOrder]

FROM

#GeneratedData AS GD;

————————————————————————————————————–

WITH

[ParentKeys] AS

(

SELECT

SD.[ParentKey] AS [ParentKey]

, SD.[ChildKey] AS [ChildKeys]

, SD.[Order] + 1 AS [NextOrder]

, SD.[ReverseOrder] AS [ReverseOrder]

FROM

#SortedData AS SD

WHERE

SD.[Order] = 1

UNION ALL

SELECT

SD.[ParentKey] AS [ParentKey]

, CAST((PK.[ChildKeys] + ‘,’ + SD.[ChildKey]) AS VARCHAR(4000)) AS [ChildKeys]

, SD.[Order] + 1 AS [NextOrder]

, SD.[ReverseOrder] AS [ReverseOrder]

FROM

[ParentKeys] AS PK

INNER JOIN

#SortedData AS SD ON

PK.[NextOrder] = SD.[Order]

AND PK.[ParentKey] = SD.[ParentKey]

)

————————————————————————————————————–

INSERT INTO

#ConcatenatedData

([ParentKey], [ChildKeys])

SELECT

PK.[ParentKey] AS [ParentKey]

, PK.[ChildKeys] AS [ChildKeys]

FROM

[ParentKeys] AS PK

WHERE

PK.[ReverseOrder] = 1;

————————————————————————————————————–

DROP TABLE #GeneratedData;

DROP TABLE #SortedData;

DROP TABLE #ConcatenatedData;


 

 

While Loop

SET NOCOUNT ON;

DECLARE

@Iteration INT = 0

, @MaxIterations INT = 10000000

, @Partitioner INT = 1

, @ParentKey INT = 0

, @ChildKey INT = 0;

CREATE TABLE #GeneratedData

(

[ParentKey] INT NOT NULL

, [ChildKey] INT NOT NULL

);

————————————————————————————————————–

WITH

[GeneratedData] AS

(

SELECT

@Iteration AS [Iteration]

, @Partitioner AS [Partitioner]

, CASE

WHEN @Iteration % @Partitioner = 0

THEN @Iteration

ELSE @ParentKey

END AS [ParentKey]

, CASE

WHEN @Iteration % @Partitioner = 0

THEN @Partitioner 1

ELSE @Iteration % @Partitioner

END AS [ChildKey]

UNION ALL

SELECT

GD.[Iteration] + 1 AS [Iteration]

, CASE

WHEN ((GD.[Iteration] + 1) % GD.[Partitioner]) = 0

THEN ((GD.[Partitioner] + 1) % 30) + 1

ELSE GD.[Partitioner]

END AS [Partitioner]

, CASE

WHEN ((GD.[Iteration] + 1) % GD.[Partitioner]) = 0

THEN GD.[Iteration] + 1

ELSE GD.[ParentKey]

END AS [ParentKey]

, ((GD.[Iteration] + 1) % GD.[Partitioner]) AS [ChildKey]

FROM

[GeneratedData] AS GD

WHERE

GD.[Iteration] < @MaxIterations

)

————————————————————————————————————–

INSERT INTO

#GeneratedData

([ParentKey], [ChildKey])

SELECT

GD.[ParentKey] AS [ParentKey]

, GD.[ChildKey] AS [ChildKey]

FROM

[GeneratedData] AS GD

OPTION

(MAXRECURSION 0);

————————————————————————————————————–

ALTER TABLE #GeneratedData

ADD PRIMARY KEY CLUSTERED

(

[ParentKey]

, [ChildKey]

);

GO

————————————————————————————————————–

DECLARE

@Iteration INT = 0

, @MaxIterations INT = 0;

————————————————————————————————————–

CREATE TABLE #SortedData

(

[ParentKey] INT

, [ChildKey] VARCHAR(4000)

, [Order] INT

, [ReverseOrder] INT

PRIMARY KEY

(

[Order]

, [ParentKey]

)

);

CREATE TABLE #ConcatenatedData

(

[ParentKey] INT

, [ChildKeys] VARCHAR(4000)

);

————————————————————————————————————–

INSERT INTO

#SortedData

([ParentKey], [ChildKey], [Order], [ReverseOrder])

SELECT DISTINCT

GD.[ParentKey] AS [ParentKey]

, GD.[ChildKey] AS [ChildKey]

, ROW_NUMBER() OVER (PARTITION BY GD.[ParentKey] ORDER BY GD.[ParentKey], GD.[ChildKey]) AS [Order]

, ROW_NUMBER() OVER (PARTITION BY GD.[ParentKey] ORDER BY GD.[ParentKey], GD.[ChildKey] DESC) AS [ReverseOrder]

FROM

#GeneratedData AS GD;

————————————————————————————————————–

SELECT

@Iteration = 2

, @MaxIterations = MAX(SD.[Order]) + 1

FROM

#SortedData AS SD;

————————————————————————————————————–

INSERT INTO

#ConcatenatedData

([ParentKey], [ChildKeys])

SELECT

SD.[ParentKey] AS [ParentKey]

, SD.[ChildKey] AS [ChildKeys]

FROM

#SortedData AS SD

WHERE

SD.[Order] = 1;

————————————————————————————————————–

WHILE @Iteration < @MaxIterations

BEGIN

————————————————————————————————————–

UPDATE

#ConcatenatedData

SET

[ChildKeys] = CAST((CD.[ChildKeys] + ‘,’ + SD.[ChildKey]) AS VARCHAR(4000))

FROM

#ConcatenatedData AS CD

INNER JOIN

#SortedData AS SD ON

CD.[ParentKey] = SD.[ParentKey]

AND SD.[Order] = @Iteration;

SET @Iteration = @Iteration + 1;

END;

————————————————————————————————————–

DROP TABLE #GeneratedData;

DROP TABLE #SortedData;

DROP TABLE #ConcatenatedData;


 

 

Performance Analysis

Recursive CTE vs While Loop - Row Concatenator - Performance Analysis - Duration

Recursive CTE vs While Loop - Row Concatenator - Performance Analysis - CPU

Recursive CTE vs While Loop - Row Concatenator - Performance Analysis - Reads

Recursive CTE vs While Loop - Row Concatenator - Performance Analysis - Charts

Advertisements

, , , , , , , , ,

1 Comment

%d bloggers like this: