Building Great Cubes: Tip 2 - Time Matters, part 1

Published 04 April 08 08:23 PM | psprag 

One common theme I see in Analysis Services cubes as I visit partners and customers is large time dimensions. In the last year alone I have seen at least 10 cubes that have had time dimensions that extended past 2010. None of these cubes had facts associated with that range. I often seen cubes with time dimensions that start as early as 1901 and one that started with the year 0. There are often facts that appear to be associated with this historical period. Why is this a concern? Two reasons stand out, User Experience Complexity and Data Correctness.

User experience is an issue that I am going to address often on this blog. There are some great tools in the MS BI but their value to the end user is driven to a great deal by the quality of the cube. Large time dimensions are a simple but common example of poor cube design effecting user experience. When navigating the time dimension the user will see a large number of years that have no purpose. If they select any of these they receive a resulting empty row or column. While there is nothing "wrong" about this it does add unneeded clutter and complexity to the user experience. The goal should be clarity, only including members that have some relation to the data.

Does this mean a cube designer should only include members that directly link to a fact as a best practice? No. It means the dimension should include only those members that give the user context when understanding the data. It is certainly reasonable to include periods when there was no data captured because there was no business activity. Including members for holidays or weekends is a classic example where we include members without facts. Likewise including members for 1990 in a sales cube for a software firm founded in 2002 isn’t meaningful.

Data correctness is a more critical problem. When a fact in links to a nonsense date and you expose that to your business users you are advertising that you have a data problem. Do we really expect our business users to make decisions based on a cube with obvious data errors even when those errors are trivial? I am always surprised when I visit a customer and I see data in 1907 or 1906. Invariably I am told that the data warehouse is wrong. The suggestion is that the business users are "ok" because they understand that 1906 probably means 2006. This is one of the reasons that cube designers add or allow large time dimensions. This allows the cube to be processed with dirty data. It is a much better practice to limit the dimension members to a reasonable range of dates and deal with the bad data before proceeding. Better yet, a limited range of dimension dates AND an SSIS package that checks for out of range data as part of the ETL process.

Limiting the time dimension to a range of dates reasonable for your business use is just one simple step towards better time dimensions but the end result is certainly worth it. Over the next few weeks I will have a number of additional posts to discuss more ways to better handle time in SSAS cubes.

Remember, in the Microsoft Analytics environment the SSAS cube is a business document for all intent and purposes. Time and any other dimensions should be prepared and formatted with the care you would reserve for any end user documentation. 

Pete Sprague

Filed under: , ,

Comments

# Wade Dorrell said on June 6, 2008 4:49 PM:

Good article Pete... thanks. - Wade

# Derrick Lau said on August 11, 2008 3:44 PM:

I am writing a cube that shows the number of units assigned to a customer group on either a monthly or weekly basis.  However, the transactional table that contains this information shows the unit id, the customer it is assigned to, and the date the assignment is effective and the date the unit was assigned to yet another customer.  How should I create my dimensions and measures?  Currently I created a customer and time dimension, and a measure being the total number of units, but I cannot get the number of units assigned to each customer per month/per week.

Anonymous comments are disabled

Search

This Blog

Syndication

Page view tracker