CSS SQL Server Engineers

This is the official team Web Log for Microsoft Customer Service and Support (CSS) SQL Support. Posts are provided by the CSS SQL Escalation Services

Query Performance and multi-statement table valued functions

Query Performance and multi-statement table valued functions

  • Comments 3

Lately I worked with a customer to help tune his query involving multi-statement table valued function.   When using table valued functions, you should be aware of a couple of things

First, there are two type of table valued functions which are inline table valued function (Inline TVF) and multi-statement table valued function (multi-statement TVF).    Inline table valued function refers to a TVF where the function body just contains one line of select statement.   There is not return variable.   Multi-statement table valued function refers to a TVF where it has a return table  variable.  Inside the function body, there will be statements populating this table variable.  In the demo at the end of this blog, there are examples of inline TVF and multi-statement TVF.

Secondly, multi-statement TVF in general gives very low cardinality estimate.

If you use inline TVF, it’s like you are just using a view and if it takes parameter, it’s like a parameterized view.   The final SQL plan will not have any reference to the TVF itself.  Instead, all the referenced objects will be in the final plan.

But if you use multi-statement TVF, it’s treated as just like another table.   Because there is no statistics available, SQL Server has to make some assumptions and in general provide low estimate.  If your TVF returns only a few rows, it will be fine.  But if you intend to populate the TVF with thousands of rows and if this TVF is joined with other tables, inefficient plan can result from low cardinality estimate.

In the demo, I created a TVF called tvf_multi_test(), then I join it with other tables with the query below.

select c.ContactID, c.LastName, c.FirstName, Prod.Name,
COUNT (*) 'numer of unit'
from Person.Contact c inner join
dbo.tvf_multi_Test() tst on c.ContactID = tst.ContactID
inner join Production.Product prod on tst.ProductID = prod.ProductID
group by c.ContactID, c.LastName, c.FirstName, Prod.Name

As you can see from the plan here, the estimates are off (resulting from the Table Scan on tvf_multi_test)

image

Solutions

  1. If you don’t plan to join a multi-statement TVF with other tables, you are OK because the low cardinality estimate doesn’t matter.
  2. If you know that your multi-statement TVF will always return small number of rows, you are OK as well.
  3. Use inline TVF when possible:  In the demo, it’s unnecessary to use a multi-statement TVF.  By changing it to inline TVF, the estimates will be accurate.
  4. If you anticipate large number of rows will result from executing the multi-statement TVF and you will need to join this TVF with other tables, consider putting the results from the TVF to a temp table and then join with the temp table.

Demo

/*
Purpose: to demonstrate estimate for multi-statement table valued function
  will have incorrect estimate if large number of rows
setup: it requires sql 2008 AdventureWorks sample database
*/

/*************************************************************
1. creating a TVF to populate from a few other tables
**************************************************************/
use AdventureWorks
go

if OBJECT_ID ('tvf_multi_Test') is not null
    drop function tvf_multi_Test
go

/*
creating multi-statement TVF
*/
create function tvf_multi_Test()
returns @SaleDetail table (ContactID int, ProductId int)
as
begin
insert into @SaleDetail
select ContactID, ProductID from Sales.SalesOrderHeader soh inner join
Sales.SalesOrderDetail sod on soh.SalesOrderID = sod.SalesOrderID
return
end

go

/*************************************************************
2.  exec plan with the multi-statement TVF
**************************************************************/
set statistics profile on
set statistics io on
set statistics time on
go
/*
  the estimate is inaccurate for tvf_multi_Test (always 1 row)
  the plan is not efficient because it drove 121,317 index seek on Product table
  and additional 121,317 seeks on contact table
*/
select c.ContactID, c.LastName, c.FirstName, Prod.Name,
COUNT (*) 'numer of unit'
from Person.Contact c inner join
dbo.tvf_multi_Test() tst on c.ContactID = tst.ContactID
inner join Production.Product prod on tst.ProductID = prod.ProductID
group by c.ContactID, c.LastName, c.FirstName, Prod.Name

go
set statistics profile off
set statistics io off
set statistics time off

go

/*************************************************
3. re-write to use inline table valued function
*************************************************/
if OBJECT_ID ('tvf_Inline_Test') is not null
    drop function tvf_Inline_Test
go
create function tvf_Inline_Test()
returns table
as
return select ContactID, ProductID
    from Sales.SalesOrderHeader soh
        inner join Sales.SalesOrderDetail sod
        on soh.SalesOrderID = sod.SalesOrderID

go

/*****************************************************
4. exec plan for inline TVF
this will get good plan.
In fact, you no longer see the table valued function in
the plan.  It behavies like a view
******************************************************/
set statistics profile on
set statistics io on
set statistics time on
go

select c.ContactID, c.LastName, c.FirstName, Prod.Name,
COUNT (*) 'numer of unit'
from Person.Contact c inner join
dbo.tvf_inline_Test() tst on c.ContactID = tst.ContactID
inner join Production.Product prod on tst.ProductID = prod.ProductID
group by c.ContactID, c.LastName, c.FirstName, Prod.Name

go
set statistics profile off
set statistics io off
set statistics time off

Jack Li |Senior Escalation Engineer|Microsoft SQL Server Support

Leave a Comment
  • Please add 5 and 7 and type the answer here:
  • Post
  • Does this change if the multi-statement table function uses schemabinding? In my experience, I've seen huge improvement using it, on the order of 300% (from 27,000ms to 74ms for the same function after it was rewritten to use  schemabinding). I even tried it without schemabinding after rewritting the multi-statement function and it was still slower than using schemabinding.

  • This is really good information. Thank you.

  • This is very helpful information.   Thanks.

Page 1 of 1 (3 items)