Helpful information and examples on how to use SQL Server Integration Services.
One of the post-CTP3 changes for SSIS in SQL Server 2012 is the addition of a user interface for the Pivot transform. This post walks through the new UI, which can be found in the new RC0 preview release.
For this example, we’ll be pulling data from the AdventureWorks sample database. We’ll be grabbing the total sales of all products in the Accessories category (ProductCategoryID = 4), grouped by year.
The data will look something like this
We want the end results to be pivoted to look like this (Total product sales by year):
Add an OLE DB Source transform, and add a connection manager for the AdventureWorks sample database. Paste in the query to retrieve the total product sales by year:
Add a Pivot Transform from the SSIS Toolbox (found under the Other Transforms section by default).
Connect the Source to the Pivot transform. Double click the Pivot transform to open its editor.
The Pivot UI shows a sample pivot table at the top of the form, where you configure the Pivot Key (the column to use for values across the “top” of the table), the Set Key (the column to use for values down the “left” of the table), and the Pivot Value (the column to use for the values in the middle). The bottom of the UI is where you configure the pivot key values.
After mapping my columns to the appropriate keys, the UI looks like this:
The Pivot transform requires you to enter all of the possible Pivot Key values (so it can create output columns for each one). You can manually enter each key here (in our case, we’d have a value for each year that appears in our data set – 2002, 2003, and 2004). Alternatively, we can click the “Ignore un-matched Pivot Key values” checkbox and run the package as is. The pivot transform will output a log message containing all of the key values that we can then copy and paste into the UI.
The Pivot UI should now look like this:
Click OK to save the changes. We’ll add a Row Count transform to the data flow, and connect it to the Pivot’s output. Add a data viewer on the path so we can see the end results.
Run the package, and we can see the pivoted results.
That is far and away the most beautiful, intuitive and interesting DataFlow Component UI I have ever seen. Sending the collected keys to the progress window is brilliant.
So what happens when there's a new entry in the Pivot Key (eg. 2005 data)? I'm assuming it doesn't get automagically added to the output columns. Would this require a development refresh or a fancy post pivot script to catch the "Ignored" values and self-correct the pivot transform?
Yeah - you'd need to update the package. Enter the new value(s) in the box, click the Generate Columns Now button, and it will be added to the list.
This is great. I can't wait to work with the new SSIS at clients. (I hope they upgrade soon)
meh, had this since msaccess 1997. boring.
If microsoft was smart, they would listen to Brent, and the rest of the world. and make pivot dynamic like MSaccess 1997 does.. Matt your answer to Brents question just proves how much microsoft wastes time on 40% completed features.
Creating a pivot in Access/Excel is probably the best way to do it. I typically recommend leaving the data in a tabular format until you actually need to present it in a pivotted form. It's a lot easier to deal with tabular data in an ETL process, because the columns are fixed (just like most database tables). Given that SSIS sets the data flow metadata at design time, it can't dynamically add columns for new values when the package is run - it wouldn't fit into the buffer system we use for quickly moving data through our processing pipeline. Even with this limitation though, Pivot is still quite useful in a lot of customer scenarios.
I agree it would be wasted time if no one uses it, but judging from customer reactions when we demo the feature, I don't think that will be the case.
Thanks for the comments!
Just off the top of my head you could calculate the pivot key as a relative number (year or month etc) and supply the pipeline with a series of set columns. For example current year - year (i.e. 2011-2010 = 1, 2011-2009 = 2....) and define in the pivot a rolling 10years of values (i.e. ,,,,,,,,,,) that way you don't have to worry about setting the columns specifically.
Hamish, perhaps if you took some time to understand how the SSIS dataflow works rather than trolling you'd understand that a dynamic pivot is not possible - just as in any metadata-bound system (e.g. a relational database).
Maybe a better idea would be to suggest an alternative (as Martin did)? For example, my suggestion would be that the Pivot transform could have the option to provide an "Other" column on the output that contains data for any pivot key value that were not planned for at design-time.
7 Years overdue but still welcome.
My 0.0001 cent into the "dynamic" discussion - the example was well-chosen for familiarity of the background, but badly chosen in respect to use-cases. Pivoting aggregates *for presentation* simply does not belong in a data flow. The SSRS Matrix is for that. Massaging Entity-Attribute-Value data stored vertically in a relational table into something a report designer can work with (and where pivoting is not feasible in source SQL due to there being no SQL source or a source in an SQL dialect no-one in-house understands) would be a good use-case.
The UI did not work when i had more than one column for the pivot value. When I try to add second pivaot value, it overwrites the first one.