I am often asked questions like these: what is the impact of ProcessAdd on indexes or aggregations? Do I need to recreate aggregations or indexes after running ProcessUpdate? What is the right approach to handling processing? So I thought of writing this article.
Here are the major process types in SQL Server 2005 / 2008 / 2008 R2 Analysis Services:
ProcessFull - Simple to digest: it applies to all objects, discards the storage contents of the object, and rebuilds them. It applies to descendant objects as well.
ProcessDefault - It does the bare minimum needed to bring the object to a fully processed state, which means it builds only the storage contents that are not currently built. For example, if you have created a new project and just deployed it to the AS server, then ProcessDefault on the database will do full processing. Or take another case, where a partition already has fact and aggregation data; then ProcessDefault will only build the bitmap indexes.
But if you change the binding of an object (e.g., change the structure of the object by adding or removing an attribute), then ProcessDefault on the object will discard all the storage contents and rebuild them.
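A minimal XMLA sketch of a ProcessDefault command against a database; the DatabaseID value is a placeholder (note that XMLA uses object IDs, not names), so substitute the ID of your own deployed database:

```xml
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Process>
    <Object>
      <!-- Placeholder: replace with your Analysis Services database ID -->
      <DatabaseID>AdventureWorksDW2008</DatabaseID>
    </Object>
    <!-- ProcessDefault builds only what is not already built -->
    <Type>ProcessDefault</Type>
  </Process>
</Batch>
```

You can run this from a SQL Server Management Studio XMLA query window connected to the Analysis Services instance.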
ProcessData - Applies to dimensions, cubes, measure groups, and partitions. It discards the storage contents and rebuilds only the data (no aggregations or bitmap indexes).
For dimensions, it builds the hierarchies and attributes; for partitions, it builds the fact data.
ProcessIndexes - Applies to dimension, cube, measure group, and partition. It requires that the object must already have its "data" built; otherwise, it raises an error. ProcessIndexes preserves the data and rebuilds the "indexes". For dimensions, it builds the bitmap indexes. For partitions, it builds the aggregation data and bitmap indexes. ProcessIndexes is recursively applied to all descendants of the object as well.
ProcessUpdate - Now here is the most confusing part of all. It applies only to dimensions. It is the equivalent of incremental dimension processing in Analysis Services 2000. It sends SQL queries to read the entire dimension table and applies the changes: member updates, additions, and deletions.
Since ProcessUpdate reads the entire dimension table, it begs the question, "How is it different from ProcessFull?" The difference is that ProcessUpdate does not discard the dimension storage contents. It applies the changes in a "smart" manner that preserves the fact data in dependent partitions. ProcessFull, on the other hand, does an implicit ProcessClear on all dependent partitions. ProcessUpdate is inherently slower than ProcessFull since it is doing additional work to apply the changes.
Depending on the nature of the changes in the dimension table, ProcessUpdate can affect dependent partitions. If only new members were added, then the partitions are not affected. But if members were deleted or if member relationships changed (e.g., a Customer moved from Redmond to Seattle), then some of the aggregation data and bitmap indexes on the partitions are dropped (with ProcessUpdate, flexible aggregations and indexes on related partitions will be dropped). The cube is still available for queries, albeit with lower performance.
Which means that after ProcessUpdate you need to process indexes on the partitions. The simple way of dealing with this problem is to run ProcessDefault on the associated cubes; ProcessDefault will re-create the dropped aggregations and bitmap indexes for the affected partitions.
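The approach above can be sketched as a single XMLA batch: ProcessUpdate on the dimension followed by ProcessDefault on the cube. The DatabaseID, DimensionID, and CubeID values here are placeholders for your own object IDs:

```xml
<Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine" Transaction="true">
  <!-- Step 1: apply dimension changes; this may drop flexible
       aggregations and bitmap indexes on dependent partitions -->
  <Process>
    <Object>
      <DatabaseID>AdventureWorksDW2008</DatabaseID>
      <DimensionID>Dim Customer</DimensionID>
    </Object>
    <Type>ProcessUpdate</Type>
  </Process>
  <!-- Step 2: ProcessDefault rebuilds only what was dropped -->
  <Process>
    <Object>
      <DatabaseID>AdventureWorksDW2008</DatabaseID>
      <CubeID>Adventure Works</CubeID>
    </Object>
    <Type>ProcessDefault</Type>
  </Process>
</Batch>
```

Running both steps in one transactional batch keeps the cube queryable until the batch commits.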
If you want to check which aggregations are dropped after ProcessUpdate, follow these steps (applies only to SQL 2008):
1) Take the output (save it for later comparison) of firing this DMV:
SELECT * FROM SYSTEMRESTRICTSCHEMA($system.discover_partition_stat
, CUBE_NAME = 'Adventure Works'
, DATABASE_NAME = 'AdventureWorksDW2008'
, MEASURE_GROUP_NAME = 'Sales Orders'
, PARTITION_NAME = 'Internet_Sales_2001')
2) Run ProcessUpdate on a dimension which impacts this cube.
3) Fire the same command as specified in step 1 again; you will find that a few aggregations have been dropped.
For checking impacted indexes, follow these steps (applies only to SQL 2008):
1) Take the output (save it for later comparison) by firing this DMV:
SELECT * FROM SYSTEMRESTRICTSCHEMA($system.discover_partition_dimension_stat
, CUBE_NAME = 'Adventure Works'
, DATABASE_NAME = 'Adventure Works DW'
, MEASURE_GROUP_NAME = 'Sales Orders'
, PARTITION_NAME = 'Internet_Sales_2001')
2) Run ProcessUpdate on a dimension which impacts this cube.
3) Fire the same command as specified in step 1 again and look at the ATTRIBUTE_INDEXED column; compare it with the output of step 1 and you will find differences.
ProcessAdd - Applies to dimensions and partitions; it did not exist in SSAS 2000. It essentially optimizes ProcessUpdate for scenarios where you are only adding new records to the dimension or fact tables.
Before using this processing type you need to guarantee that only new rows are being added. ProcessAdd never deletes or updates existing members; it only adds new members. You can restrict the dimension table so that ProcessAdd reads only the new rows.
Good news: unlike ProcessUpdate, ProcessAdd doesn't drop aggregations and indexes.
This option is not available in the GUI; you need to fire it using XMLA (ProcessAdd).
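A minimal XMLA sketch of ProcessAdd on a dimension; again, the DatabaseID and DimensionID values are placeholders for your own object IDs, and this assumes the source table now contains only rows that are new to the dimension:

```xml
<Process xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <!-- Placeholders: replace with your database and dimension IDs -->
    <DatabaseID>AdventureWorksDW2008</DatabaseID>
    <DimensionID>Dim Customer</DimensionID>
  </Object>
  <!-- ProcessAdd only inserts new members; existing members,
       aggregations, and bitmap indexes are left untouched -->
  <Type>ProcessAdd</Type>
</Process>
```

In practice you would combine this with an out-of-line binding or a restricted view so that the engine reads only the newly arrived rows.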
ProcessClear - Discards the storage contents of the object. It applies to all objects and to descendant objects as well.
Hey, I had a doubt: if the cube is currently in a browsable state and I load some new data into the database, then do cube processing and this processing fails, will it make the cube unbrowsable?
Or will the cube instead remain in its previous state?
No, in that case the processing transaction will roll back :)
Thanks for the quick response Karan :)
Hello Karan Gulati, I am new to SSAS. Can you please tell me how many types of designs there are in both SSAS and SSRS?
I have been working in SSAS 2012 and SSIS 2012 for the past 4 months. Currently I have a small problem.
My cube structure is like below:
some 20 dimensions
15 LINKED measure groups
I need to perform incremental load on daily basis, which requires 10 measure groups out of 15 to be processed.
Before that I perform full processing on all dimensions, which causes all cubes and measure groups to go to an unprocessed state.
If I start full cube processing, it takes more than 8 hrs to complete, which becomes a headache for me as well as for the users.
So I tried to make use of STRUCTURE processing for the cube, in which I perform the following steps in this order:
1. full processing of required dimensions
2. full processing of the required partitions of the source cube.
3. STRUCTURE processing of the cube to which these MGs are linked.
But what is happening is that while processing the structure of the cube, it kicks off full processing and again takes a lot of time to complete.
I want to know: is there a problem in the order I am following? (Will processing the structure of the final cube first and then processing the partitions of the different MGs work? That is, exchanging the positions of steps 2 and 3.)
Or is there any other way to handle this more efficiently and faster?
Appreciate your help.