Performance Benefits of using Expression over User Defined Functions

Performance Benefits of using Expression over User Defined Functions

  • Comments 4

Summary

No matter how simple a UDF is, there's a large performance penalty paid when they're used.  This penalty shows up as poor query execution time when a query applies a UDF to a large number of rows, typically 1000 or more.  The penalty is incurred because the SQL Server database engine must create its own internal cursor like processing.  It must invoke each UDF on each row.  If the UDF is used in the WHERE clause, this may happen as part of the filtering the rows.  If the UDF is used in the select list, this happens when creating the results of the query to pass to the next stage of query processing. It's the row by row processing that slows SQL Server the most. In many cases an expression can replace the functionality that the UDF provides and offer a significant performance benefit over a UDF (see examples below).

More Information

SETUP FOR THE TEST
======================
--- Create Test Table
CREATE TABLE UDF_parameters (param1 int, param2 int, param3 int, param4 int)
-- Create Clustered Index
CREATE CLUSTERED INDEX CLU1 ON UDF_parameters(Param4)
--Code for the UDF.
CREATE FUNCTION dbo.divide_func(@numerator as int, @denominator as int, @default as float)
RETURNS float
BEGIN
                if @denominator = 0
                                RETURN @default
                if @numerator = 0
                                RETURN @default
                RETURN @numerator/@denominator
END
--- Insert records in the table ----
declare @count int
set @count =1 -----> This will insert 1 million records to the table
while @count <1000001
begin
                INSERT INTO UDF_parameters values(@count+3,@count+2,@count+1, @count)
                set @count = @count+1
end
Using UDF in Query SELECT list
SET STATISTICS TIME ON
GO
select PARAM1, DEVIDED_VALUE = dbo.divide_func(param2,param3,CAST(param4 as float)), param4 from UDF_parameters
GO
SET STATISTICS TIME OFF
GO
Rows Executes StmtText
------ -------- ---------------------------------------------------------------------------------------------------------------
500000 1 select PARAM1, DEVIDED_VALUE = dbo.divide_func(param2,param3,CAST(param4 as float)), param4 from UDF_parameters
500000 1 |--Compute Scalar(DEFINE:([Expr1004]=[TEST1].[dbo].[divide_func](parameter... )
500000 1 |--Clustered Index Scan(OBJECT:([TEST1].[dbo].[UDF_parameters].[CLU1]))
If you notice the plan, we would see that the Function call is being made, for all the rows being returned by the Clustered Index Scan.
If you take a profiler trace while running this statement, you would notice the multiple executions of the UDF. (Image at the end of the Page)
Query with UDF in Where Clause
SET STATISTICS TIME ON
GO
select PARAM1,param2,param3, param4 from UDF_parameters where dbo.divide_func(param2,param3,CAST(param4 as float)) = 1
GO
SET STATISTICS TIME OFF
GO
Rows Executes StmtText
------ -------- -------------------------------------------------------------------------------------------------------------------------------------------------------------500000 1 select PARAM1,param2,param3, param4 from UDF_parameters where dbo.divide_func(param2,param3,CAST(param4 as float)) = 1
500000 1 |--Filter(WHERE:([TEST1].[dbo].[divide_func]([TEST1].[dbo].[UDF_parameters].[param2],[TEST1].[dbo].[UDF_parameters].[param3],CONVERT(float(53),[TEST1].[dbo].[UDF_parameters].[param4],0))=(1.000000000000000e+000)))
500000 1 |--Clustered Index Scan(OBJECT:([TEST1].[dbo].[UDF_parameters].[CLU1]))
      Notice a similar behaviour in multiple executions of the UDF, when we have the UDF in the where clause. (Image at the end of the Page)
Using Expression In Query
SET STATISTICS TIME ON
GO
SELECT PARAM1, 
DEVIDED_VALUE =
                   CASE
                                WHEN PARAM3=0 THEN param4
                                WHEN param2=0 THEN param4
                                ELSE param2/param3
                   END ,
PARAM4
from UDF_parameters
GO
SET STATISTICS TIME OFF
GO
Rows Executes StmtText
------ -------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
500000 1 SELECT PARAM1,DEVIDED_VALUE = CASE WHEN PARAM3=0 THEN param4 WHEN param2=0 THEN param4 ELSE param2/param3 END ,PARAM4 from UDF_parameters
0 0 |--Compute Scalar(DEFINE:([Expr1004]=CASE WHEN [TEST1].[dbo].[UDF_parameters].[param3]=(0) THEN [TEST1].[dbo].[UDF_parameters].[param4] ELSE CASE WHEN [TEST1].[dbo].[UDF_parameters].[param2]=(0) THEN [TEST1].[dbo].[UDF_parameters].[param4] ELSE [TEST1].[dbo].[UDF_parameters].[param2]/[TEST1].[dbo].[UDF_parameters].[param3] END END))
500000 1 |--Clustered Index Scan(OBJECT:([TEST1].[dbo].[UDF_parameters].[CLU1]))
Notice, how using a case expression, has only on execution.
Comparison of the time taken by the Different Statements.

Rows Returned

WITH UDF in Select List

UDF in Where Clause

With Expression

0.5 Million

26 seconds

26 seconds

4 seconds

1.0 Million

52 seconds

52 seconds

8 seconds

1.5 Million

86 seconds

87 seconds

12 seconds

Snapshots of Profiler Traces
==============================
As we can see, the query when using the function is slower than the one using the CASE expression.
In this case I was using very simple parameters, but if we have some complex parameters (like those involving mathematical operation) we would see the response time increase even more.

by
Sourabh Agarwal
Technical Lead, Microsoft SQL Server

Leave a Comment
  • Please add 4 and 7 and type the answer here:
  • Post
  • Is there a performance difference bewteen calling a t-sql UDF & a sql-clr UDF for a simple non data access UDF?

  • Yes, there would a slight difference in the performance of t-sql UDF & a sql-clr UDF. The reason being the multiple switches which need to be made between SQL engine and the CLR runtime. With t-sql UDF everything is within the SQL Engine and no addtional switches are required.

  • Could you fix a link to the profiler image?

  • Question is answered here:

    stackoverflow.com/.../database-function-vs-case-statement

Page 1 of 1 (4 items)