TSQL_language's WebLog

TSQL: compiler under the covers

Here is an interesting comment from chrisbro about some features he finds sorely missing in TSQL:

* Can't specify a parameter to be used as an IN clause
* Can't easily specify parameters to be used in the ORDER BY clause; especially painful to do multiple order by clauses

I actually hear quite a bit of requests in the same "parameterization" vein, such as "how come I cannot specify a table name as a parameter?". So let's look under the hood for a moment. Although it looks to the user as a "pure interpreter", TSQL is actually a compiler. During the compilation phase, all tables/columns are fetched from the catalogs, the statement gets type checked, and, most importantly, the query processor comes up with a plan for executing the query. Compilation is a pretty expensive process, so we keep the resulting plan in cache and reuse it if we see the same statement again. So, here is a rule: during compilation, we cannot look at the actual variable or parameter values. Otherwise, we would have to compile the query every time it is executed, because the parameter value may be different. So, we restrict parameters to the relatively "benign" positions where they cannot really influence the data types involved in a statement and the execution plan. This allows us to compile a statement once and reuse the plan for different parameter values (of course I am simplifying a bit here, let me know if more details would be of interest). Now, if you look at, say, ORDER BY clause, the execution plan is actually different (in some cases radically different) depending on what columns you sort on. That's why we don't allow you to parameterize this aspect of the query (IN clause is roughly the same deal). Good workarounds for parameterizing the IN clause are described here. When you want to parameterize your ORDER BY clause, you have multiple options, and which one you choose depends on how flexible you want to be on your ordering options and what your performance requirements are. The most performant solution would be to write several separate queries, each one with the particular flavor of ORDER BY that you want, and then choose one of them using IF statement. If you want to be absolutely flexible, you can use dynamic SQL (i.e. compose your statement on-the-fly and execute it using EXEC). This is slower, and requires you to be careful to avoid SQL injection attacks if you take the ORDER BY list directly from the user input (tip: check that this is indeed a simple ORDER BY list before composing the statement).

So, why cannot SQL Server just implement the above "workarounds" as features and allow parameterizing of ORDER BY clause for the user? One reason is performance. Any generic solution would require essentially generating dynamic SQL under the covers. This would incur serious performance overhead in excess of "regular" parameterization, which would be "unexpected" as opposed to the explicit use of dynamic SQL where you are aware that it has perf penalty. Another reason is that we just cannot implement all features in the world, we have to prioritize them. So, keep posting your wishes, and they may come true in upcoming releases.

This posting is provided "AS IS" with no warranties, and confers no rights.

Published Friday, September 24, 2004 12:00 PM by TSQL_language

Comments

 

Martin said:

The big problem I often have to deal with is linked databases. Is there any way of paramaterizing database links. That is can I pass a database name in as a paramater and then link a query using the database. Every company I have worked at in the last 5 years has had this problem. The only way we have figured out to do this is to use dynamic SQL.
If you or anyone else has an answer to this question it would be great.
September 24, 2004 1:15 PM
 

Grumpy Today said:

Your explanation of why things are now as they are is clear. And some very nice suggested workarounds, too.

Unfortunately, then you just go into excuses -- basically about performance, but you correctly identify "expectation" as the root of that problem. Expectation is something that you can cure by education, not technology.

Ultimately the application programmer is responsible for the total performance of his/her application ... It's not your responsiblity if they choose low-performant solutions (though better and better performance measurement tools are cool). But right now, these app devs are just finding ways to do the "parameterization" before submitting the query to SQL, so it still ends up doing multiple compilations. No performance gain is realized, by the lack of this feature.

This also results in each app dev creating their own custom solution to this problem. Why not do it once, put it in the SQL product, and be done with it.

So, please, go ahead and implement the tablename and orderby parameters. After all, the reason people are asking for this is because their current expectations are not being met.

Perhaps your compiler could just keep multiple execution plans (instead of just one) for queries with these special parameters.
September 24, 2004 1:29 PM
 

Steve said:

The inability to use parameters in an order by clause have caused my group to give birth to some horrible, horrible hacks that should have never existed. If you told me that the query would have to be recompiled between rounds - my team is smart enough to know when its important to have that cached query and when we can live without it.

But I'd really like to at least have the option.

-Steve
September 24, 2004 2:27 PM
 

2005 SP1 said:

Would this be too big of a change before SQL 2008 is released?
September 24, 2004 3:19 PM
 

Anatoly Lubarsky: Weblog said:

September 25, 2004 12:23 PM
 

Anatoly Lubarsky: Weblog said:

September 25, 2004 12:28 PM
 

Ron Buckton said:

Although slower, this is how I usually get around parameterizing the order by clause when I have to do sorting on the server side.

(see http://blogs.chroniclesdesign.com/rbuckton/articles/SQL2000ServerPaging.aspx)

...
ORDER BY
CASE
WHEN @SortBy = 'Author'
THEN Book.Author

WHEN @SortBy = 'PublishDate'
THEN CONVERT(nvarchar(255), Book.PublishDate)

WHEN @SortBy = 'Publisher'
THEN Book.Publisher

ELSE Book.Title
END ASC
September 27, 2004 11:28 AM
 

chrisbro said:

Thanks for the extra detail - I'm actually using a pair of somewhat ugly workarounds right now to support this.

Ultimately, I completely agree with Steve and Grumpy. If the language won't allow us to use parameters because it would force query recompilations, then we find ways to meet our performance and security needs that still ultimately require recompilations. We're forced to find clever SQL hacks that ultimately are just as nonperformant, with an added dev/test cost as we reinvent a kludgy wheel.

For the IN clause, my proc takes 20 parameters, @AT0..@AT19, and executes the statement WHERE AssetType IN (@AT0, @AT1, ..., @AT19). I'm building a somewhat generic API, so I can't possibly know about all the possible AssetType values in advance. Constructing any form of dynamic SQL is not acceptable. The IN (big list) approach is a bit faster than joining on a temp table or a subquery.

To handle ORDER BY, I pass in a string such as'title asc, assetid desc' then I run a query like this:

ORDER BY
case when 'title asc' = ltrim(rtrim(substring(@SortProperty, 1, patindex('%,%',@SortProperty)-1)))
then title else null end asc,
case when 'title desc' = ltrim(rtrim(substring(@SortProperty, 1, patindex('%,%',@SortProperty)-1)))
then title else null end desc,
case when 'assetid asc' = ltrim(rtrim(substring(@SortProperty, 1, patindex('%,%',@SortProperty)-1)))
then assetid else null end asc,
case when 'assetid desc' = ltrim(rtrim(substring(@SortProperty, 1, patindex('%,%',@SortProperty)-1)))
then assetid else null end desc,

case when 'title asc' = ltrim(rtrim(substring(@SortProperty, patindex('%,%',@SortProperty)+1, 100)))
then title else null end asc,
case when 'title desc' = ltrim(rtrim(substring(@SortProperty, patindex('%,%',@SortProperty)+1, 100)))
then title else null end desc,
case when 'assetid asc' = ltrim(rtrim(substring(@SortProperty, patindex('%,%',@SortProperty)+1, 100)))
then assetid else null end asc,
case when 'assetid desc' = ltrim(rtrim(substring(@SortProperty, patindex('%,%',@SortProperty)+1, 100)))
then assetid else null end desc


Something else I noticed while working on this: every time I run a batch that creates a temp table, adds a few rows, does a query, then drops the temp table, the number of reads performed by the batch increases by 7-8. If I reconnect it starts over and begins incrementing by 7-8 again. This suggests that, if my app was using connection pooling, that performance would degrade over time.
September 28, 2004 4:19 PM
 

Microsoft » Blog Archives » In this series of articles I want to said:

March 3, 2007 10:02 AM
 

Microsoft » Blog Archives » … a message must already exist before the said:

June 19, 2007 2:37 PM
 

Microsoft » Blog Archives » Here is an interesting comment from chrisbro about said:

June 30, 2007 7:03 PM
 

Microsoft » Blog Archives » Here is an interesting comment from chrisbro about said:

September 1, 2007 1:47 PM
 

tsql said:

April 29, 2008 12:42 AM
Anonymous comments are disabled

© 2008 Microsoft Corporation. All rights reserved. Terms of Use  |  Trademarks  |  Privacy Statement
Microsoft
Page view tracker