Analytical and Transactional MDM

 

I was talking to someone about Analytical and Transactional MDM recently and we realized that while there are quite a few conceptual differences between the two, there’s a significant amount of overlap in the implementation details.  For my purposes, I’ll define Analytical MDM as the processes and tools to manage the dimensions in a data warehouse or OLAP cube and Transactional MDM as the processes and tools to manage the master data used in transactional systems.

 

With few exceptions, the data for the two styles of MDM looks the same.  Transactional MDM might have a few more attributes associated with a given entity because there are things the operational system cares about that that aren’t required for analysis.  An Analytical MDM hub will probably store more hierarchies than a transactional hub because there are generally hierarchies that are interesting in analysis and reporting that the operational system may not care about.  These differences aren’t incompatible and it probably makes sense to use the same MDM hub to store both analytical and transactional master data because it will be much easier to manage in one place than if you had separate hubs for the two uses of the same entities.  This seems like a good argument for looking for an MDM hub solution that isn’t limited to only a single style of master data.

 

Another difference between the two styles of MDM might be in the way data is loaded and published.  Loading an analytical hub is usually done in batches – maybe once a day while most transactional style systems are loaded an entity at a time as the entities are created or modified in the operational systems.  Other than this, the transformations, duplicate checks, business rules checks, etc. involved in loading master data into a hub are the same in either style.  This means that other than the batch size (one in the case of transactional and “N” in the case of analytical) there’s really not much difference in the load processes.

 

Publishing data is different in the two styles of MDM but not in an incompatible way.  Transactional MDM data is generally published in a “push” method where changes to the master data are pushed out to the operational systems but there are many applications that either don’t expose the required interfaces or a run by groups that won’t allow data to be pushed into their system so a “pull” style publication is required.  Analytical MDM data is generally pulled from the hub when the OLAP cube is built or when the data warehouse is updated.  This means that an MDM hub will usually have to support both push and pull style synchronization with operational systems and warehouses so again there’s not a significant difference between the requirements of transactional and analytical MDM.

 

 

I’ve heard from many people that a transactional MDM system need higher performance and scaleability than an analytical system but I’m not sure that’s necessarily true.  The data quantities are going to be identical in either case because if you add 10,000 customers a day to your operational systems, you will need to load 10,000 into the MDM hub whether the data is used in the operational system or the warehouse.  In fact, the loading is probably spread out over the business day for transactional MDM while analytical MDM loading probably has to fit into a batch window at the end of the day so the analytical may actually need more loading performance.  Publishing to several operational systems will take more processing than publishing to a data warehouse but the transformations, encoding, and messaging required can be easily unloaded to a separate server so this doesn’t affect MDM hub performance much.  In some architectures latency might be a bigger issue for transactional MDM than analytical MDM so shorter processing lengths and asynchronous business rule enforcement might be necessary.

 

So what does this all mean?  My take on it would be to look for an MDM solution that can support both transactional and analytical styles.  I think in most cases the logical progression would be to start with analytical MDM to master the data models, rules, technology and stewardship required to manage your master data in a less mission-critical environment.  Once you have achieved some successes in analytical MDM, you can use the same data, models and processes to manage the master data for your transactional systems by just adding the publishing logic to push the master data into the operational systems.