In the whitepaper, Hardware Sizing a Tabular Solution, we made a reference to Analysis Services Tabular and Non Uniform Memory Access or NUMA. Specifically, Analysis Services Tabular mode is not NUMA Aware. SSAS Tabular mode runs in memory and will load any tabular models into memory on startup. Having the model loaded into memory is what gives SSAS tabular models such impressive query performance. With that said, there are cases where hardware can limit the speed of SSAS Tabular. More specifically, if your Tabular model is running on a system with NUMA hardware you could see a performance impact if Analysis Services needs to access memory located on remote NUMA node. A introduction and discussion on NUMA is outside the scope of this blog post however if you would like to know more, the following article provides some background on the topic http://technet.microsoft.com/en-us/library/ms178144(v=sql.105).aspx
During a recent lab engagement we tested query workloads for some large tabular databases on NUMA and non-NUMA hardware, then on NUMA affinitized instances of Analysis Services. I found that the majority of the queries initially ran two or three times slower on NUMA hardware with a cold cache.
Since publishing the Hardware Sizing guide for Analysis Services many customers have approached me with questions specific to Analysis Services tabular models running on NUMA hardware. In this blog post I will explain how to determine if you are running on NUMA hardware and provide some options for forcing Analysis Services Tabular instances to run on a single NUMA node.
To understand the NUMA architecture for a system there are a couple of easy ways to find out. The first and easiest way to tell is by using task manager
A second option is to use a tool from Sysinternals called CoreInfo CoreInfo will tell you if your hardware is NUMA and it also provides an estimate of what kind of performance impact accessing remote NUMA memory will have.
If you are using a system that is using NUMA and you would like to affinitize Analysis Services to a NUMA node I will explain two options that I have tested and work well.
One way to force Analysis Services to use only 1 NUMA node is to use Windows System Resource Manager or WSRM. To use WSRM you need to first add the feature from the “Add Roles and Features Wizard” Once you have installed WSRM you will then need to create a Resource Allocation Policy and a ProcessMatching criteria to force SSAS to one NUMA node. WSRM is listed as deprecated in MSDN however I haven’t found any compelling reason not to use it and it appears to be the only way to force processor affinitization with SSAS outside of virtualization http://technet.microsoft.com/en-us/library/cc753939.aspx
A second way to force NUMA affinitization to SSAS instances is by using Hyper-V. The benefit of this method as mentioned earlier in the write-up is that you do not need to modify the SSAS Properties in the .ini file to account for the fact that the SSAS will be limited to a specific group of processors and a fraction of the memory.
In addition to setting the NUMA affinity per VM you can also set the following at the Host level. Unchecking the option “Allow virtual machines to span physical NUMA nodes” in theory
Hopefully this post gives you a basic understanding of NUMA and why you should be aware of it when it comes to SSAS tabular mode. If you are experiencing slow cold cache response times from your SSAS Tabular database queries, take a look at your hardware and see if it could be NUMA related. If so, try one of the methods I mention to force affinitization to a NUMA node or test on non-NUMA hardware.
What needs to be changed on the MSMDSRV.ini file?
1 node cluster
2 SSAS(2012) Intances - TAB & AS
4proc*12core=48 or (0-47)
Dell PowerEdgeM910 -- E7540 @2.00GHz with 225Gb RAM
BI Team came to a conclusion after talking to Marco Russo (thank you very much sir for your time) that the mdx query would need to run on 1 proc or NumNode0. They saw a huge difference on query execution from 7.3 sec to ~4sec. This was achi
2)Ran mdx => ~7+sec
1)Connected to TAB instance(rdp)
2)Star Task manager
3) Identify process for TAB; right click Set Affinity
4) Uncheck <All Proc>
5) Selected CPU 0-11(Node0)
6)Clear Cache (xmla)
7)Ran mdx => ~3.6->4.2sec
1)Followed steps 2-4
2)Brought down /up TAB resource offline
Any direction/help would be appreciated
Thanks in advance!