[This article was contributed by the SQL Azure team.]
“Why?” my five year old asks. “Because I said” is my usual response. In the blog comments, we have noticed an abundance of “Why do we have to have a clustered index on our tables?” In this case “Because we said” is not going to cut it and really does not work with my daughter either.
SQL Azure requires clustered indexes for our replication technology. At any one time, we are keeping three replicas of data running – one primary replica and two secondary replicas. We use a quorum based commit scheme where data is written to the primary and one secondary replica before we consider the transaction committed. Each write is preformed to exactly the same leaf node in same spot of the data row of all three replicas. In other words, the data pages are exactly the same in all three replicas.
In SQL Server and SQL Azure, clustered indexes are organized as B-trees. Each page in an index B-tree is an index node. The top node of the B-tree is the root node. The bottom level of nodes in the index is the leaf nodes. Any index levels between the root and the leaf nodes are collectively the intermediate levels. Having B-trees maintained on each replica independently enables us to do certain performance optimizations, local maintenance like defrag and reducing cross machine traffic since we do not ship data that as needed to run physical recovery across machines.
In a clustered index, the leaf nodes contain the data pages of the underlying table. The root and intermediate level nodes contain index pages holding index rows. Each index row contains a key value and a pointer to either an intermediate level page in the B-tree, or a data row in the leaf level of the index. The pages in each level of the index are linked in a doubly-linked list.
Nonclustered indexes have the same B-tree structure as clustered indexes, however in SQL Server If the table is a heap, which means it does not have a clustered index, the row locator is a pointer to the row. The pointer is built from the file identifier (ID), page number, and number of the row on the page. The whole pointer is a Row ID (RID). The data rows are not stored in any particular order, and there is no particular order to the sequence of the data pages. The data pages are not linked in a linked list. For more information, see Heap Structures. The reason we don’t support heap tables is that data pages are not ordered – a requirement for our replication.
In SQL Azure the ordering of the data pages are the key to our data replication, and that is why you need to have a clustered index on your tables. We believe that you are the best person to pick the clustered index and will pick that best clustered index for your database design to achieve maximum performance.
Do you have questions, concerns, comments? Post them below and we will try to address them.
Sorry, but this makes no sense. You are pushing a technical requirement for replication up far too high in the stack. You have failed to encapsulate at the proper level this need. My application saves tables to databases through ODBC. So now I have to have special-case code in my application that says "If SQL Azure, create this goofy clustered index."
This is a bug.