Daily Archives: 26 June 2018

Using Stuff to Clean Up String Concatenation

In my last post, I covered the simple, yet handy, STUFF function. It’ll take a string and “stuff it right in there”. Today, I want to cover a real work scenario where I use it quite often. I’ll use the AdventureWorks database (“AWD”) to demonstrate.

Joins create rows. If you run the following query against the AWD, you’ll see the SalesOrderId repeated for each SalesOrderDetail row:

SELECT TOP 150
	soh.SalesOrderID
	,sod.SalesOrderDetailID
FROM 
	Sales.SalesOrderHeader soh
INNER JOIN 
	Sales.SalesOrderDetail sod on sod.SalesOrderID = soh.SalesOrderID

But there are times where you want only 1 row for each SalesOrderID. And so we need a mechanism which can effectively concatenate all the SalesOrderDetailIDs for each SalesOrderID and return it as a single cell, adjacent to the corresponding SalesOrderID.

Consider the following query:

SELECT TOP 10
	soh.SalesOrderID
	,(SELECT CHAR(44) + CAST(sod.SalesOrderDetailID AS NVARCHAR(MAX))
			 FROM Sales.SalesOrderDetail sod
			 WHERE soh.SalesOrderID = sod.SalesOrderID
			 FOR XML PATH('row'), ROOT('result'),TYPE) AS SalesOrderDetailIDs
FROM Sales.SalesOrderHeader soh
ORDER BY soh.SalesOrderID

which yields the following resultset:

Figure 1


You can see that via the use of the XML query syntax and a correlated subquery, I have gone a long way to achieving our goal. All the SalesOrderDetailIDs are in a cell such that there is only 1 row for each SalesOrderID. But we are not interested in well-formed XML, having the XML type (which was done by using the TYPE directive and specifying a ROOT element). We need to peel back the XML elements and extract the value within. We can use the xQuery API to return an NVARCHAR(MAX), which has effectively returned the contents of each XML “row” element.

So now our query looks like this:

SELECT 
	soh.SalesOrderID
	,((SELECT CHAR(44) + CAST(sod.SalesOrderDetailID AS NVARCHAR(MAX))
			 FROM Sales.SalesOrderDetail sod
			 WHERE soh.SalesOrderID = sod.SalesOrderID
			 FOR XML PATH('row'),TYPE, ROOT('result')).value('/result[1]','varchar(max)')) AS SalesOrderDetailIDs
FROM Sales.SalesOrderHeader soh
ORDER BY soh.SalesOrderID

But we’re not quite there. We have 1, small, pesky comma at the beginning of the NVARCHAR, an example of which looks like this:

,1,2,3,4,5,6,7,8,9,10,11,12

This is where STUFF comes in. If we were to deal with this in isolation, it would look like this:

SELECT STUFF(',1,2,3,4,5,6,7,8,9,10,11,12', 1,1, N'')

We are telling STUFF to go to the first character, delete 1 character and replace it with an empty string. That’s how we remove the leading comma. Incorporating that into the main query, we have our final code:

SELECT 
	soh.SalesOrderID
	,(SELECT STUFF((SELECT CHAR(44) + CAST(sod.SalesOrderDetailID AS NVARCHAR(MAX))
			 FROM Sales.SalesOrderDetail sod
			 WHERE soh.SalesOrderID = sod.SalesOrderID
			 FOR XML PATH('row'),TYPE, ROOT('result')).value('/result[1]','varchar(max)'),1,1,N'')) AS SalesOrderDetailIDs
FROM Sales.SalesOrderHeader soh
ORDER BY soh.SalesOrderID

And our resultset:

Figure 2

Note: Rob Farley, in his post Handling special characters with FOR XML PATH, notes that you can:

  1. Remove the ROOT
  2. Change the xQuery to .value('.','varchar(max)')

Rob also makes the point that “it’s always felt strange to me to be returning something which I declare to be XML (using TYPE) without it being well-formed”. I agree with this. It’s a bit weird and makes it harder to see what is going on. But I wanted to point that out for completeness.

Otherwise, that’s a real world example usage of STUFF.