Welcome to MSDN Blogs Sign in | Join | Help
Friday, May 06, 2005 3:06 PM Michael S. Kaplan

Making SQL Server index usage a bit more deterministic

Chris McKenzie blogged about a situation that I will be talking about in my upcoming TechEd 2005 presentations (particularly DBA319!), so I thought I might talk about the issue a bit here.

In his post Note to Self, Chris said:

Don't put ORDER BY Statements in Views.

Delay sorting until you're actually about to use the data.

If you end up sorting differently than the default sort specified by the ORDER BY in the view, it comes as an expensive performance hit. Witness a stored procedure that was breaking a unit test due to a SqlClient.SqlTimeout Exception (took 2.5 minutes to run). Removing the ORDER BY statements from the underlying views brought execution time down to 6 seconds.

Luckily it is not that as bad as all that (or it does not have to be). Though I actually said as much in my International Features in SQL Server 2000 white paper, which was reviewed and approved by the SQL Server International team (and several people on the Engine and at least one architect!):

Issues with defining collation at the column level

How often would you have a database that needs one sort order (for example, Latin1_General) and a column that needs a different one (for example Greek)? Sometimes, this may be crucial, but in other cases if the data in your database does not use a single collation, then it is probably multilingual data that may need to be sorted according to more than one collation. Being able to define multiple collations, each of which can be indexed, allows you to access the Greek data by specifying the Greek collation, and to have this query be an indexed search.

That last clause "and to have this query be an indexed search" is the crux of the matter. In the example provided earlier, using a COLLATE expression in the ORDER BY clause of a query gives you the functionality; however, this will not be an indexed ordering, so it will be slower for large datasets. As it stands, column-level collation makes sense only if you do not have monolingual data in a column, or if you denormalize your database to store different languages in different columns.

It was later pointed out to me that there was a way to make this work in Shiloh (SQL Server 2000), a way which gets much better in Yukon (SQL Server 2005). You can basically create multi-lingual indexes!

It works like this:

Say you have a Table tbl with column c with collation of Latin1_General_CI_AS

select * from tbl
order by c COLLATE Japanese_CI_AS

No Japanese index exists here -- so everything done with this uery will be slower.

create table tbl (c nvarchar(100), c2 as c COLLATE Japanese_CI_AS)
create index itc2 on tbl(c2)

The original query can now use the Japanese index!

Now, In SQL Server 2000, you must explicitly use that field c2 to get the index to work.

But in SQL Server 2005, the work was done in the engine to allow deterministic index usage that would not treat all indexes as "equal" and arbitrarily pick one. If an index exists on a column that matches the characteristice of the query, then the QPE will make intelligent use of that index.

Pretty cool trick, if you ask me. I wish I had known about in several of the applications I did, and before I wrote the white paper!

I will be delving into this sort of item and more in Orlando and in Amsterdam, so if you are going to TechEd 2005 then be sure to sign up for DBA 319: Best Practices for Search in Multilingual Data Sets (Using SQL Server Collation for Sorting and Indexing).

 

This post brought to you by "ยง" (U+00a7, a.k.a. SECTION SIGN)

Comment Notification

If you would like to receive an email when updates are made to this post, please register here

Subscribe to this post's comments using RSS

Comments

# re: Making SQL Server index usage a bit more deterministic

Friday, May 06, 2005 7:00 PM by Chris McKenzie
Thanks for posting this. I solved my earler problem by delaying sorting until the stored procedure. I identified the problem as being that the ORDER BY clause in my stored proc was sorting the data on a different set of columns than the default sort specified by the underlying view. Are you saying that I could also have improved performance by adding indexes to the columns or column-groups that I will most often be sorting on? I may have to compare and benchmark the two approaches.

This was a very interesting blog. Thanks again.

# re: Making SQL Server index usage a bit more deterministic

Friday, May 06, 2005 8:23 PM by Michael S. Kaplan
Yes, that is what I am saying. :-)

And of course in SQLS 2005 the syntax is more natural since you never have to think about the indexes after you have created them; they will just work....

Glad you liked it! I'll be doing more SQL server posts as I get closer and closer to TechEd.

# Shout out to Michael Kaplan

Monday, May 09, 2005 7:11 AM by Chris McKenzie's Blog

# Achieving case insensitivity

Wednesday, May 11, 2005 12:09 AM by Sorting It All Out
Julia Lerman said just a few hours ago that she is a SQL Server luddite.
I think that is not really...

# Case/kana/accent/width sensitive SQL Server, for testing

Wednesday, May 11, 2005 12:17 AM by Sorting It All Out
A few minutes ago, I posted about Achieving case insensitivity, to deal with how best to make things...

# Not all SQL Server collations are created equal

Sunday, May 15, 2005 10:15 AM by Sorting It All Out
Not all of the collations in SQL Server 2005 (a.k.a. Yukon) or earlier versions have coverage over...

# Performance issues with language specific sorts?

Wednesday, June 07, 2006 7:15 AM by Sorting It All Out
(No, this is not the font post!)
Bryan Murtha asks:

I've read your blog and I was hoping to get out...

# A&P of Sort Keys, part 8 (aka You can often think of ignoring weights as a form of ignorance)

Tuesday, September 18, 2007 3:01 AM by Sorting It All Out

Previous posts in this series: Part 0: The empty string sorts the same in every language Part 1: The

# Making SQL Server operations slower (without explicitly trying)

Tuesday, August 26, 2008 10:04 AM by Sorting it all Out

So I was chatting with Kim and Paul after that .NET DA meeting I mentioned the other day. The one where

Leave a Comment

(required) 
(optional)
(required) 

  
Enter Code Here: Required
 
Page view tracker