Today I got in to some heavy weight TSQL tuning. This time the target was a legendary sproc that was taking 3 mins and now I’m about to call it a day when this giant SP is eating only 16 sec. Not excellent but not bad at this point. Here are some notes…
One of the big performance hits occur when you must process individual rows one at a time instead of in set. For instance, let’s say you have a table with a column that has comma delimited values. Now you want to split these values in each cell and create a new table which would have N rows for each row in original table – one for each spitted value. The Internet is littered with dozen ways to split strings in TSQL, some even uses CTEs (not a good idea because there are lots of gotchas like max recursion limit). So far the best way is to use SQL CLR with code like below. Its as fast as any native TSQL juggling, if not faster. However most important thing here is not SQL CLR but how you use this table valued function and here’s the secret: The best bang for the performance you would get is using CROSS APPLY (or OUTER APPLY) with table valued UDF.
public partial class UtilityFunctions { [Microsoft.SqlServer.Server.SqlFunction(FillRowMethodName = "FillRow", TableDefinition="StringPart nvarchar(max)", IsDeterministic=true, IsPrecise=true, SystemDataAccess=SystemDataAccessKind.None)] public static IEnumerable ClrSplitString(SqlString sqlStringToSplit, SqlChars delimiter, SqlBoolean removeEmptyEntries) { if (!string.IsNullOrEmpty(sqlStringToSplit.Value)) { return sqlStringToSplit.Value.Split(delimiter.Value , (StringSplitOptions)(removeEmptyEntries ? StringSplitOptions.RemoveEmptyEntries : StringSplitOptions.None)); } else { return null; } } public static void FillRow(object obj, out SqlString splittedString) { if (obj != null) splittedString = new SqlString((string)obj); else splittedString = SqlString.Null; } }