As of September 1, 2013 we decided to remove SQLCAT.COM site and use MSDN as the primary vehicle to post new SQL Server content. This was done to minimize reader confusion and to streamline content publication. MSDN SQLCAT blogs already includes most SQLCAT.COM Content and will continue to be updated with more SQLCAT learnings. You can also find a collection of our work in SQLCAT Guidance eBooks.
To make the transition to MSDN smoother, we are in the process of reposting a few of SQLCAT.COM’s blogs that are still being very actively viewed. Following is reposting of one of the SQLCAT.Com blogs that still draws huge interest. You can find this, and other guidance in the SQLCAT's Guide to Relational Engine eBook.
Also, Follow us on Twitter as we normally use our Twitter handles @MSSQLCAT and @MSAzureCAT to announce news and new content.
Building a large scale relational data warehouse is a complex task. This article describes some design techniques that can help in architecting an efficient large scale relational data warehouse with SQL Server. Most large scale data warehouses use table and index partitioning, and therefore, many of the recommendations here involve partitioning. Most of these tips are based on experiences building large data warehouses on SQL Server 2005.
tks, very good
Something else I'd like to add: instead of deleting directly from a partitioned table, SWITCH the partition with an empty single-partition-sized table and then safely TRUNCATE the latter.
Awesome information! Thanks for the share.
Why not use the date data type directly rather than an integer surrogate key? What disadvantage am I missing?
@Martin Smith Star schema optimisations and possibly, depending on type choosen, size.
How does this change for 2012 / 2014? I notice it's a little out of date (like only supporting 1000 partitions)
Star join optimisation works fine with date datatype key.
@Martin Smith, the date dimension table usually contains a number of extra "derived" fields that enable efficient grouping and aggregating. For example, each date in the date dim will have quarter, week number, month etc that can be used directly without the need to use functions (which slow the queries down).
Thanks for this collection of points which I appreciate, however I beg to differ on Build clustered index on the date key of the fact table since this is not a unique key on a Fact Table for instance you could have many records having the same DateKey? Moreover in most of the datamarts I have used or created, the datekeys are foreignKeys to the Date or time dimension. I would much appreciate if you could please clarify. Thanks my email is email@example.com