Forcing NUMA Node affinity for Analysis Services Tabular databases

Forcing NUMA Node affinity for Analysis Services Tabular databases

Rate This
  • Comments 9

In the whitepaper, Hardware Sizing a Tabular Solution, we made a reference to Analysis Services Tabular and Non Uniform Memory Access or NUMA. Specifically, Analysis Services Tabular mode is not NUMA Aware. SSAS Tabular mode runs in memory and will load any tabular models into memory on startup. Having the model loaded into memory is what gives SSAS tabular models such impressive query performance. With that said, there are cases where hardware can limit the speed of SSAS Tabular. More specifically, if your Tabular model is running on a system with NUMA hardware you could see a performance impact if Analysis Services needs to access memory located on remote NUMA node. A introduction and discussion on NUMA is outside the scope of this blog post however if you would like to know more, the following article provides some background on the topic  http://technet.microsoft.com/en-us/library/ms178144(v=sql.105).aspx 

During a recent lab engagement we tested query workloads for some large tabular databases on NUMA and non-NUMA hardware, then on NUMA affinitized instances of Analysis Services.  I found that the majority of the queries initially ran two or three times slower on NUMA hardware with a cold cache.

Since publishing the Hardware Sizing guide for Analysis Services many customers have approached me with questions specific to Analysis Services tabular models running on NUMA hardware.  In this blog post I will explain how to determine if you are running on NUMA hardware and provide some options for forcing Analysis Services Tabular instances to run on a single NUMA node.

To understand the NUMA architecture for a system there are a couple of easy ways to find out.  The first and easiest way to tell is by using task manager

  1. Open Task Manager and click on the Performance Tab
  2. Next click on CPU
  3. Right Click the Graph area and select Change graph to \ NUMA nodes
  4. On a system with no NUMA hardware you will notice NUMA nodes is greyed out.   In the screenshot below, the system on the left is from a server with 4 NUMA nodes and the system on the right is from a machine with no NUMA hardware

A second option is to use a tool from Sysinternals called CoreInfo  CoreInfo will tell you if your hardware is NUMA and it also provides an estimate of what kind of performance impact accessing remote NUMA memory will have.

  1. Download and extract CoreInfo from the following link CoreInfo
  2. Open an administrative Command Prompt and navigate to the directory where you extracted CoreInfo
  3. Run the following command  CoreInfo.exe
  4. In the output you will see a section called “Logical Processors to NUMA Node Map”.  This will tell you if you are using NUMA and if so how many NUMA nodes.  If the System is NUMA then you will also see the approximate cross-NUMA node access cost which gives you an idea of what type of performance hit you may see when referencing memory on remote NUMA nodes

 

If you are using a system that is using NUMA and you would like to affinitize Analysis Services to a NUMA node I will explain two options that I have tested and work well.

  • Windows System Resource Manager
  • NUMA Node Affinitization Using a Hyper-V Virtual Machine

One way to force Analysis Services to use only 1 NUMA node is to use Windows System Resource Manager or WSRM. To use WSRM you need to first add the feature from the “Add Roles and Features Wizard”  Once you have installed WSRM you will then need to create a Resource Allocation Policy and a ProcessMatching criteria to force SSAS to one NUMA node.  WSRM is listed as deprecated in MSDN however I haven’t found any compelling reason not to use it and it appears to be the only way to force processor affinitization with SSAS outside of virtualization http://technet.microsoft.com/en-us/library/cc753939.aspx

  1. Before using WSRM you need to identify which processors are assigned to which NUMA node.  Use CoreInfo to find out.
  2. Add Windows System Resource Manager using the Add Roles and Features Wizard as seen below

  3. Once installed open WSRM

  4. Create a new resource allocation Policy for the processors assigned to the NUMA node you SSAS to run under.  In the screenshot below I specified the first instance of SSAS to run one the first NUMA node, NUMA Node 0: processors 0 – 15 (16 processors per NUMA node on the hardware I was using below…configuration is 0 based so make sure you don’t choose 1 – 16).  I specified the second instance to run under processors 16 – 31 (NUMA NODE 1)

     

  5. Once you have your SSAS instances set up and running in WSRM you will need to modify your Analysis Services settings in the server properties or using the MSMDSRV.ini file.  Many SSAS settings are dynamic based on the number of CPUs, Memory on the machine.   For example on a server with 1.5 Terabytes of memory and 4 NUMA nodes with 16 cores a piece we would limit each instance of SSAS to the amount of memory available to that NUMA node (1536GB/4 = 384).  If SSAS is limited to 384GB and 16 cores our default values for a setting such as LowMemoryLimit will be skewed; the default of 65% would be calculated based on the overall memory of the machine which SSAS would see as 1536 GB then set the Low Memory Limit to roughly 998GB which would be way more than SSAS actually has available. To address this you need to modify your .ini file settings to account for the amount of Memory and number of CPUs you are limited.

A second way to force NUMA affinitization to SSAS instances is by using Hyper-V.  The benefit of this method as mentioned earlier in the write-up is that you do not need to modify the SSAS Properties in the .ini file to account for the fact that the SSAS will be limited to a specific group of processors and a fraction of the memory.

  1. In Hyper-V manager create a virtual machine per instance of SSAS you want to use

  2. Go to the settings of the Virtual Machine
     
  3. Access the NUMA section under Processors

  4. Set NUMA configuration to map each VM to a NUMA node.  Calculate how much memory to set as maximum depending on the total memory available per NUMA Node.  Set Maximum amount of memory and Maximum NUMA nodes accordingly.

 

In addition to setting the NUMA affinity per VM you can also set the following at the Host level.  Unchecking the option “Allow virtual machines to span physical NUMA nodes” in theory

Hopefully this post gives you a basic understanding of NUMA and why you should be aware of it when it comes to SSAS tabular mode.  If you are experiencing slow cold cache response times from your SSAS Tabular database queries, take a look at your hardware and see if it could be NUMA related.  If so, try one of the methods I mention to force affinitization to a NUMA node or test on non-NUMA hardware.

Leave a Comment
  • Please add 3 and 2 and type the answer here:
  • Post
  • Thank you for this great post. However, compare how one can achieve the same task with the SQL Server relational engine - aetting affinity to NUMA nodes and/or cores is a piece of cake in the Server porperties / processors Dialog box. Any compelling reason not to share this code with the SQL/realtional Team?

  • Thanks for the article - very useful.

    Worth also pointing out that for this to work, the new policy must be set to be the 'Managing Policy'. This is in your screenshot but not the text instructions.

    Thanks again

    Alex

  • But can I force Numa affinity for no Numa hardware?

  • But can I force Numa affinity for a no Numa hardware?

  • It looks like your second-to-last paragraph is cut off:

    Unchecking the option “Allow virtual machines to span physical NUMA nodes” in theory...

  • I am a bit confused.

    History:

    TAB & AS on a clustered environment. Has 4 numa (12 cores) . We recently did lab testing and determined that we need to lock down TAB on Numa 0 (0-11 cores)

    Action:

    Did steps 1-4 for WSRM ensuring that i tied the process for TAB and indicated in 4.1.

    Rebooted the server and when all service started --> went to task manager --> processes tab -->right click on msmdsrv.exe (for TAB) Set Affinity and i see all the boxes are checked.

    *- Could you explain what needs to be changed for TAB in  MSMDSRV.ini file

  • Hi John,

    What needs to be changed on the MSMDSRV.ini file?

    Server Layout/Details:

    ****QA****

    1 node cluster

    2 SSAS(2012) Intances - TAB & AS

    4proc*12core=48 or (0-47)

    Dell PowerEdgeM910 -- E7540 @2.00GHz with 225Gb RAM

    OS-08R2 Ent

    History:

    BI Team came to a conclusion after talking to Marco Russo (thank you very much sir for your time) that the mdx query would need to run on 1 proc or NumNode0. They saw a huge difference on query execution from 7.3 sec to ~4sec. This was achi

    Test:

    A)BaseLine Test

     1)Clear Cache(xmla)

     2)Ran mdx => ~7+sec

    B)ChangeAffinity Test

     1)Connected to TAB instance(rdp)

     2)Star Task manager

     3) Identify process for TAB; right click Set Affinity

     4) Uncheck <All Proc>

     5) Selected CPU 0-11(Node0)

     6)Clear Cache (xmla)

     7)Ran mdx => ~3.6->4.2sec

    C)WSRM Test

     1)Followed steps 2-4

     2)Brought down /up TAB resource offline

     3)Clear cache(xmla)

     2)Ran mdx => ~7+sec

    Any direction/help would be appreciated

    Thanks in advance!

    AK

  • Hi John,

    What needs to be changed on the MSMDSRV.ini file?

    Server Layout/Details:

    ****QA****

    1 node cluster

    2 SSAS(2012) Intances - TAB & AS

    4proc*12core=48 or (0-47)

    Dell PowerEdgeM910 -- E7540 @2.00GHz with 225Gb RAM

    OS-08R2 Ent

    History:

    BI Team came to a conclusion after talking to Marco Russo (thank you very much sir for your time) that the mdx query would need to run on 1 proc or NumNode0. They saw a huge difference on query execution from 7.3 sec to ~4sec. This was achi

    Test:

    A)BaseLine Test

     1)Clear Cache(xmla)

     2)Ran mdx => ~7+sec

    B)ChangeAffinity Test

     1)Connected to TAB instance(rdp)

     2)Star Task manager

     3) Identify process for TAB; right click Set Affinity

     4) Uncheck <All Proc>

     5) Selected CPU 0-11(Node0)

     6)Clear Cache (xmla)

     7)Ran mdx => ~3.6->4.2sec

    C)WSRM Test

     1)Followed steps 2-4

     2)Brought down /up TAB resource offline

     3)Clear cache(xmla)

     2)Ran mdx => ~7+sec

    Any direction/help would be appreciated

    Thanks in advance!

    AK

  • It seems WSRM (Windows System Resource Manager) is no longer present in Windows Server 2012 R2. What is the preferred way to proceed? Is setting affinity still necessary with Tabular BI in SQL 2012 SP1?

    Recommendations please, thank you!

Page 1 of 1 (9 items)