Comments on: Performance Surprises and Assumptions : STRING_SPLIT() https://sqlperformance.com/2016/03/sql-server-2016/string-split SQL Server performance articles curated by SentryOne Mon, 03 Jun 2019 11:15:34 +0000 hourly 1 https://wordpress.org/?v=6.9.1 By: Ronen Ariely https://sqlperformance.com/2016/03/sql-server-2016/string-split#comment-162825 Wed, 13 Feb 2019 00:58:51 +0000 http://sqlperformance.com/?p=8032#comment-162825 In reply to Aaron Bertrand.

There is a simple solution for using STRING_SPLIT and guarantee the order, but it is highly not recommended for production, from the performance aspect -> only for the sake of the discussion you can check this post with full explanation and "solution":

http://ariely.info/Blog/tabid/83/EntryId/223/T-SQL-Playing-with-STRING_SPLIT-function.aspx

]]>
By: Aaron Bertrand https://sqlperformance.com/2016/03/sql-server-2016/string-split#comment-148158 Tue, 08 May 2018 22:03:28 +0000 http://sqlperformance.com/?p=8032#comment-148158 In reply to martin.

While I don't have a counter-example, I don't think it's safe to rely on this, and you're right to be nervous. If order of input is important, you should use the older methods where you build your own UDF that can preserve ordering.

]]>
By: martin https://sqlperformance.com/2016/03/sql-server-2016/string-split#comment-148104 Mon, 07 May 2018 18:33:47 +0000 http://sqlperformance.com/?p=8032#comment-148104 Thanks a lot for the write-up
I was looking into an elegant way to extract the first and last substring of a string, trying to avoid the substring/charindex/reverse… etc t-sql extra-long code.
This seems to work well, returning "first" and "last"

SELECT distinct first_value(value) over (order by (select 1)),
last_value(value) over (order by (select 1))
FROM STRING_SPLIT('first b c d e e e e e f g g h i last', ' ')

but… do you know if it is dangerous? So far I have always got good results, but the lack for an explicit row order (or rather, the hacky use of order by (select 1)) puts me a little nervous.

(alternative CHARINDEX('$', $(x_text_x))=0 then $(x_text_x) else SUBSTRING( $(x_text_x) , LEN($(x_text_x)) – CHARINDEX('$',REVERSE($(x_text_x))) + 2 , LEN($(x_text_x)) ) end is so ugly… )

Thanks a lot

]]>
By: Aaron Bertrand https://sqlperformance.com/2016/03/sql-server-2016/string-split#comment-146009 Thu, 05 Apr 2018 15:37:40 +0000 http://sqlperformance.com/?p=8032#comment-146009 In reply to Holly.

Hi Holly, there is now STRING_AGG() in SQL Server. These posts might be useful:

https://sqlperformance.com/2014/08/t-sql-queries/sql-server-grouped-concatenation

https://sqlperformance.com/2016/01/t-sql-queries/comparing-splitting-concat

https://sqlperformance.com/2016/12/sql-performance/sql-server-v-next-string_agg-performance

https://sqlperformance.com/2017/01/sql-performance/sql-server-v-next-string_agg-performance-part-2

And you're right, often on Stack Overflow you'll find users will upvote/pick as the best answer the one they found easiest to understand, or the first one they tried that worked, rather than the one that's best. And people don't go back and update old answers when new methods, like STRING_AGG(), come along.

]]>
By: Holly https://sqlperformance.com/2016/03/sql-server-2016/string-split#comment-146008 Thu, 05 Apr 2018 15:07:32 +0000 http://sqlperformance.com/?p=8032#comment-146008 Thank you for this write up. After parsing through 78,000 rows using the same methods–hands down–split string won.

Can you also do another write up on creating a string from a result set? After 230,000 rows of data I find that the 'STUFF' with FOR xml path("),TYPE).value('.','NVARCHAR(MAX)') was faster by many miles. Unfortunately, these stackoverflow answers don't touch on the cpu costs and efficiency of their suggestions. A write up like that would have saved me a few hours and multiple tests.

https://stackoverflow.com/questions/4819412/convert-sql-server-result-set-into-string

https://stackoverflow.com/questions/5196371/sql-query-concatenating-results-into-one-string

]]>
By: Aaron Bertrand https://sqlperformance.com/2016/03/sql-server-2016/string-split#comment-131148 Tue, 20 Jun 2017 14:32:52 +0000 http://sqlperformance.com/?p=8032#comment-131148 In reply to Rick.

When you're measuring time and other runtime metrics, as opposed to just qualitative observations of the execution plan, overhead that impacts performance can't hide.

]]>
By: Rick https://sqlperformance.com/2016/03/sql-server-2016/string-split#comment-131147 Tue, 20 Jun 2017 14:31:03 +0000 http://sqlperformance.com/?p=8032#comment-131147 In reply to Catalin.

I'm a bit late to the party but when it comes to performance I'd propose that the dynamic exec method can't be reliably compared to other split methods. One can compare execution stats across other methods but using exec hides the string parsing overhead in the query parsing step leaving the execution plan to be what looks like a deceptively efficient Constant Scan.
Use of replace being substantially equal in both cases, imperially, it seems unlikely that a dedicated XML parser would be unable to match the performance of the more generic query parser.
Given exec can't be used in a cross apply context the technique can't operate over sets where performance would actually matter so at best it's an extremely niche tool. But even in the case of a CSV parameter I'm not seeing any practical advantage beyond a reader/maintainer not needing understand the very simplest of XML queries. It does achieve that although I would not consider such complete avoidance of the subject to be a useful career move.

]]>
By: Aaron Bertrand https://sqlperformance.com/2016/03/sql-server-2016/string-split#comment-131028 Sun, 18 Jun 2017 20:54:44 +0000 http://sqlperformance.com/?p=8032#comment-131028 In reply to Crazy Dan.

This function was introduced late in the 2016 dev cycle and by the time they got that feedback it was too late. We've been asking for it to be augmented, but I'm not holding my breath. In order to avoid backward compatibility issues, they could only introduce a new output column by either creating a new function, making the function accept a third optional argument, or adding an option like WITH (ORDERED_OUTPUT). I don't see any of those happening in 2017, but you should vote (and more importantly, comment, stating your use case) on this Connect item, where Microsoft says:

"Thanks for the suggestion. This is in our backlog but we cannot confirm when it will be implemented."

Also note this Connect item, a duplicate but with more votes.

]]>
By: Crazy Dan https://sqlperformance.com/2016/03/sql-server-2016/string-split#comment-130991 Sun, 18 Jun 2017 00:18:30 +0000 http://sqlperformance.com/?p=8032#comment-130991 Does anyone know why a column indicating the position of the value in the sequence was not included in the resulting table?

For example, SELECT value FROM STRING_SPLIT('Lorem ipsum dolor sit amet.', ' ') would instead return….

Position Value
——— ——–
1 Lorem
2 Ipsum
3 dolor

…etc…
It is very useful for a variety of reasons, so its omission seems strange….

]]>
By: Catalin https://sqlperformance.com/2016/03/sql-server-2016/string-split#comment-121269 Wed, 22 Jun 2016 05:21:50 +0000 http://sqlperformance.com/?p=8032#comment-121269 In reply to Aaron Bertrand.

Hello Aaron, thanks a lot for your reply and your time, I understood your point and indeed this is very dangerous. But why to not insert another replace char there and replace ' with nothing to avoid bad things, your code will become a simple select. Of course there are words like – don't – that will be broke but it will still work.

select @string ='select "'+replace(replace(@string,"","),',',"' union all select "')+""

select 'a' union all select 'b' union all select 'c' union all select 'd;
update dbo.employees set salary*=2;
delete dbo.auditLog;
select foo'

What I tried to convey is that this approach at least in my mind is easier to understand and write and it is doing the job, maybe not 100% in 100% of the cases but it is quite closed.

]]>