Slava Oks's WebLog

  • Changing gears

    You haven't heard from me in while and there is a reason. I moved from SQL Server's Storage Engine team to a new incubation team that is involved in exploring new directions in operating systems. This subject area has been always very fascinating for me as you easily might deduce from my previous posts. The new team is amazing. It is small but extremely talented. Chris is one of the lead architects on the team. (I imagine many of you know him from your CLR and SQLCLR experience)

    It was sad for me to move out of SQL Server. I just love the team and product. However I couldn't let go such great opportunity :-).  In the future expect my posts to become more operating system related - Yes I will try to continue blogging as I gain more knowledge and experience.  At the same time please don't hesitate to ask questions about SQL Server, if I don't have answers I will forward them to people that do. To some of you I owe email replies. I am sorry for being slow in that regard but I will try to catch up in the next week or so.

    By the way if you have passion in operating system area, consider yourself hard working and fun person, feel free to contact me, we might need you :-).

     Change is always good and I really enjoy it :-).

    As usual if you have questions don't hesitate to ask. 

  • Q and A: Ratio between AWE Allocated and VM Committed/Reserved in DBCC MEMORYSTATUS output

    Q: What kind of ratio should you expect to see between AWE Allocated and VM Committed/Reserved?  We are running 8GB with a cap of 7GB in the max server memory and yet the VM Committed is around 100MB.  I was first shocked to see that the sqlservice in task manager only shows that SQL is using around 100MB and now that I have used MEMORYSTATUS I'm even more troubled.Does this mean that our server is starved for memory even though AWE can see almost all of the 7GB?  On a side note we are seeing very high CPU utilization and my trail has come to a potential memory issue.

    Memory Manager                  KB

    ------------------------------ --------------------

    VM Reserved                    1674140

    VM Committed                   104384

    AWE Allocated                  6747704

    Reserved Memory                1024

    Reserved Memory In Use         0

    A. Actually these two values are independent. VM Committed shows amount of physical memory SQL Server allocated using VirtualAlloc's MEM_COMMIT parameter. AWE allocated shows amount of memory allocated using AllocateUserPhysicalPages. Sum of these two counters gives an approximate amount of physical memory SQL Server uses. The reason it is an approximate amount is because some external components loaded in SQL Server address space allocate memory outside of SQL Server memory manager framework. Consequently it is impossible to output exact amount. 

    In your case you shouldn't worry - SQL Server does use physical memory you configured it to use. High CPU utilization that your system experience might be due to memory pressure or to other reasons. You need to find out if SQL Server hits memory pressure and if it does classify it and fix it appropriately.  You can find more on memory pressure in my previous posts related to memory: http://blogs.msdn.com/slavao/archive/tags/SQL+Server+Memory+Management/default.aspx or in recently published book on SQL Server 2005 troubleshooting http://www.amazon.com/o/ASIN/0321447743/ref=s9_asin_image_1/002-4273860-1748002. The book contains a chapter on how SQL Server memory manager works as well as how to recognize and fix different types of memory related issues.

  • Q & A: I am running on 64 bit and all of my tasks are waiting on SOS_RESERVEDMEMBLOCKLIST, What is going on?

    This wait type represents waiting on multi page allocations inside of SQLOS's memory manager. As you remember from my description of memory manager's working internals, it has two major allocators single page allocator, SPA and multi page allocator, MPA. Majority of the time SQL Server internal code uses SPA, however in some cases it will use MPA, i.e. for example when request exceeds size of single page, 8KB.

    On 64 bit systems SQL Server internal code can potentially generate larger memory requests which might exceed single page size - this is due to the fact that size of internal data structures on 64 bit is larger than on 32 bit platforms. A larger amount of concurrent MPA allocations could translate into more allocations served by MPA rather than by SPA.  More concurrent allocations might possibly introduce more pressure on MPA and that could translate into SOS_RESERVEDMEMBLOCKLIST waits.

    In SQL Server 2005 SP2 we have addressed this problem by making MPA significantly more concurrent.

    If you see such issue and can't wait until SP2 is out,  please contact MS Product support to get a fix.

     Let me know if you have any questions.

  • SQLOSDMV's Continue

    sys.dm_os_waiting_tasks

    One can run lots of interesting queries using this view.  You can even use this view to perform deadlock detection that is not resolvable by deadlock monitor, DM. For example if you have tasks waiting on external resources such as extended stored procedures and blocking others from running. This type of deadlock DM can’t detect but you can!

     

    1. Q. How many tasks are currently waiting?

     

    select

    count(*)

    from

    sys.dm_os_waiting_tasks

     

    This query will give you an idea of how many tasks are waiting in the system. You can use this information to understand blocking characteristics of your load

     

    1. Q. How many tasks that assigned to a worker (thread/fiber) are waiting?

     

    select

          count(*)

    from

          sys.dm_os_waiting_tasks

    where

          wait_type <> 'THREADPOOL’

     

    This query shows how many threads are actively running in the system. Latter on I will show how to find out if number of threads can be increased

     

    1. What are the tasks waiting on?

     

    select

          wait_type,

          count (*)

    from

          sys.dm_os_waiting_tasks

    group by

          wait_type

    order by

          count (*) desc

     

    One can use this query to investigate possible bottlenceks of an active load. This query groups tasks by wait type – it can’t be directly use to identify the actual bottlenecks on the system. The query gives you an idea about the wait characteristics of your load

     

    1. Q. Does my load have an active resource bottleneck?

    You can answer this question by looking at the resource address that tasks are blocked on.  Keep in mind that not all wait types have resource associated with them.

     

    select

          resource_address,

          count (*)

    from

          sys.dm_os_waiting_tasks

    WHERE

          resource_address <> 0

    group by

          resource_address

    order by

          count (*) desc

     

    1. Q: Is my system can be possibly bottlenecked on I/O?

    You can answer this question by looking at the wait type of tasks waiting on specifically you are interested in IO waits

     

    select

          *

    from

          sys.dm_os_waiting_tasks

    where

          wait_duration_ms > 20 AND

          wait_type LIKE '%PAGEIOLATCH%'

     

    You might want to change 20ms base on your I/O subsystem

     

    1. Q: Does my load have long waiting chains?

     

    This information is particular interesting to understand if a single tasks, for example one that generated long I/O, blocks others. If this happens you will have a way to improve your scalability by figuring how to remove or minimize chain length.

     

    WITH TaskChain (

    waiting_task_address,

    blocking_task_address,

    ChainId,

    Level)

    AS

    (

    -- Anchor member definition: use self join so that we output

    -- Only tasks that blocking others and remove dupliates

     SELECT DISTINCT

          A.waiting_task_address,

          A.blocking_task_address,

          A.waiting_task_address As ChainId,

        0 AS Level

    FROM

          sys.dm_os_waiting_tasks as A

    JOIN

          sys.dm_os_waiting_tasks as B

    ON

          A.waiting_task_address = B.blocking_task_address

    WHERE

          A.blocking_task_address IS NULL

    UNION ALL

    -- Recursive member definition: Get to the next level waiting

    -- tasks

    SELECT

          A.waiting_task_address,

          A.blocking_task_address,

          B.ChainId,

          Level + 1

    from

          sys.dm_os_waiting_tasks AS A

    JOIN

          TaskChain AS B

    ON

          B.waiting_task_address = A.blocking_task_address

    )

    select

          waiting_task_address,

          blocking_task_address,

          ChainId,

          Level 

    from

          TaskChain

    order by

          ChainId

     

    If there are no chains, your load is not CPU bound and you see long waits on THREADPOOL, you might improve your throughput by increasing a number of threads in the system.

     

    Keep in mind that you can extend this query to perform your own deadlock detection.

     

    You can also find out more information about each individual wait_type here

    http://msdn2.microsoft.com/en-us/library/ms179984.aspx

  • Q & A: Does SQL Server always respond to memory pressure?

    Q: I thought I've seen reference recently to problems with SQL2005 not releasing memory when under pressure, and this was possibly due to the Lock Pages In Memory option. We have a server where this seems to be happening, i.e. SQL2005 uses all available memory and the server basically dies because there's no memory left for other processes.

    How do I make it work like SQL2000 where it will release memory when it detects memory pressure from other applications?

    A: The memory behavior is slightly different between SQL Server 2005 and SQL Server 2000.  SQL Server 2000 won't respond to memory pressure when running with lock pages in memory option so you will have to configure max server memory setting in such a way that you leave enough free memory to Windows and other applications on the machine (see memory setting recommendations below)

    SQL Server 2005 does respond to memory pressure both when lock pages in memory is enabled and disabled. However there is a caveat that you need to be aware of - when OS hits memory pressure, depending on the state of the machine and applications,  it could be really slow - attempting to free up the resources. This means that SQL Server might not get enough CPU cycles to respond to memory pressure quickly and hence causing  other applications and possibly OS to return OOM errors. In case when lock pages in memory is not enabled it is possible that OS will page out SQL Server. This situation might result in inappropriate SQL Server performance.

    You can avoid such conditions by setting max memory setting to appropriate value. So what is this value? I usually recommend to derive this value based on amount of RAM available on the machine and amount of available memory you need to keep OS and other applications happy.

    RAM Installed        Available Memory (as reported by TM) under heavy load you would want to have

    < 4GB                   512MB - 1GB
    4-32GB                 1GB - 2GB
    32GB - 128GB       2GB-4GB
    128GB -                4GB-

    There are couple of things you need be aware of:

    A. You only need to consider setting max server memory if you expect load on the machine to be memory bound

    B. You need to look at the machine during heavy load to come up with appropriate max server memory setting so that you keep available memory in recommended range

    C. The more RAM you have on the machine (32GB+)  the more important to have max server memory setting on.

    D. My recommendations are "personal" recomendations, you will need to tune your box to find what is the best combination for you.

    E. Max worker threads setting does affect how much memory server will require under heavy concurrent load. Consider on x64 platform each SQL Server thread can consume 2MB of physical memory. So if you configure SQL Server for 2000 threads, under heavy load in addition to max server memory it might require additional 4GB of physical memory (2MB * 2000), it is even higher on IA64, i.e 8GB (4MB * 2000)

    F. Max server memory setting only controls size of the Buffer Pool and hence doesn't affect allocations going through Multi Page Allocator (MPA). If your load does require MPA you will have to take that into account as well. You can monitor your MPA usage by using sys.dm_os_memory_clerks DMV.

    G. Memory allocations requested by external components such as xps, COM (sp_OACreate), and others are not controlled by max server memory setting

    Let me know if you have more questions

    Thanks

    -Slava

     

  • SQLOS's DMVs Continue

    sys.dm_os_schedulers

    1. Q. Do I need to by more CPUs?

    In order to answer this question you have to find out if your load is really CPU bounded.  Your load is really CPU bounded if a number of runnable tasks per each scheduler always greater than 1 and all of your queries have correct plan.  The latter statement is very important, your load can be CPU bounded due to the fact that somehow optimizer generated bad plan – it can happen if your statistics out of date or you tried to perform handcrafted optimization. In this case you don’t want to run to Circuit City to buy more CPUs right a way – you want to fix the plan. Here is the query to find out average length of a runable queue on the system:

     

    select

    AVG (runnable_tasks_count)

    from

    sys.dm_os_schedulers

    where

    status = 'VISIBLE ONLINE'

     

    Buying more CPUs has also to do with capacity planning. You have to be very careful when performing capacity planning on hardware with HT enabled – remember you don’t have extra physical CPUs. Keep in mind that if your load runs at 60% CPU utilization - it doesn’t mean that you have 40% of extra CPU capacity. You will be very surprise how fast CPU load will jump from 60% to 80% and then even faster to 100% once you apply more and more load.

     

    1. Q. What is affinity of my schedulers to CPUs?

     

    select

          scheduler_id,

          CAST (cpu_id as varbinary) AS scheduler_affinity_mask

    from

    sys.dm_os_schedulers

     

    1. Does my machine have either hard or soft NUMA configuration enabled?

     

    select

          CASE count( DISTINCT parent_node_id)

          WHEN 1 THEN 'NUMA disabled'

          ELSE 'NUMA enabled'

          END

    from

          sys.dm_os_schedulers

    where parent_node_id <> 32

     

    1. Q. Should I configure SQL Server to use more threads – sp_configure ‘max server threads’?

    You can answer this question by looking at the work queue length for each scheduler. If on average such value is above 1 then you might benefit from adding more threads to the system but only if

    A.  Your load currently is not CPU bounded (See info above on how to find out if your load is CPU bound)

    B. Your load currently doesn’t experience any other heavy waits (If you add more threads in this case they will just end up waiting as everyone else)

     

    select

    AVG (work_queue_count)

    from

    sys.dm_os_schedulers

    where

    status = 'VISIBLE ONLINE'

     

    1. Q: Is my system I/O bound?

    You can answer this question by monitoring length of I/O queues.

     

    select

          pending_disk_io_count

    from

          sys.dm_os_schedulers

     

     If over time they keep on growing or you are seeing periodic jumps or numbers stay relatively high most likely your system is I/O bound. In order to identify the cause you will have to dive further.

     

  • SQLOS's DMVs

    SQL Server 2005 has a large number of DMVs that you can take advantage of especially when you try to understand behavioral characteristics of your SQL Server 2005 installation. DMVs expose enormous amount of information but how can you make use of it? What questions you can answer using DMVs? 

     

    So in order to make you life a bit easier, I would like to start series of posts dedicated to SQLOS DMVs with emphasis on actual problems.

     

    Following set of DMVs is exposed by SQLOS or SQLOS related infrastructures.

     

    sys.dm_os_sys_info

    sys.dm_os_schedulers

    sys.dm_os_waiting_tasks

    sys.dm_os_wait_stats

    sys.dm_os_systhrteads

    sys.dm_os_virtual_address_dump

    sys.dm_os_latch_stats

    sys.dm_os_hosts

    sys.dm_os_buffer_descriptors

    sys.dm_os_perf_counters

    sys.dm_os_ring_buffers

    sys.dm_os_tasks

    sys.dm_os_workers

    vsys.dm_os_memory_clerks

    sys.dm_os_memory_cache_counters

    sys.dm_os_memory_cache_clock_hands

    sys.dm_os_memory_cache_hash_tables

    sys.dm_os_memory_cache_entries

    dbcc memorystatus

     

    I listed them in a random order J. You can find description for each one of them in BO so I won’t repeat the description here. The main point is figure out, understand, how to make use of them.

     

    Once we are done with all posts you will be able to:

    -         Find out if you need to buy more CPUs

    -         Find out if you need to buy more memory

    -         Find out if you running under memory pressure

    -         Find out if you have problems with disks

    -         Find out if you need to move to 64bit based solution

    -         Find out if you need to change your application in order to take advantage of force parameterization

    -         Find out what applications you loaded in SQL Server address space

    -         Find out if SQL Server is paged out and if it affects performance of your application

    -         Find out if your hardware is NUMA

    -         And much more

     

    So here it goes:

    sys.dm_os_sys_info

    1. Q. How many sockets does my machine have?

    select

    cpu_count/hyperthread_ratio AS sockets

    from

    sys.dm_os_sys_info

     

    1. Q. Is my machine hyper threaded?

    Well unfortunately you can’t derive this information using this DMV today though there is a column called hyperthread_ratio. On the other hand this column can tell you:

    Q. How many either cores or logical CPU share the same socket?

    select

    hyperthread_ratio AS cores_or_logical_cpus_per_socket

    from

    sys.dm_os_sys_info

     

    1. Q. Does my 32 bit system have /3GB or /Userva switch in boot.ini?

    select

         CASE

               WHEN virtual_memory_in_bytes / 1024 / (2048*1024)

    < 1 THEN 'No switch'

               ELSE '/3GB'

         END

    from sys.dm_os_sys_info

     

    1. Q. How much physical memory my machine has?

    select

          physical_memory_in_bytes/1024 AS physical_memory_in_kb

    from

          sys.dm_os_sys_info

     

    1. Q. How many threads/workers SQL Server would use if the default value in sp_configure for max worker threads is zero:

    select

          max_workers_count

    from

          sys.dm_os_sys_info

     

    1. Q. What is a size of AWE window or what is the max amount of memory right now that can be used by QO, QE and Caches simultaneously when SQL Server running in AWE mode?

    select

          bpool_visible

    from

          sys.dm_os_sys_info

     

    I will be happy to answer your questions.

  • Talking points around SQL Server 2005 scalability

    Couple of weeks ago I did a talk on SQL Server 2005 scalability. The actual talk was not about the enormous amount of features that you can leverage in SQL Serve 2005 to make your application scale but rather how SQL 2005 achieves its scalability. In this post I present a set of talking points of the talk. Please let me know if you have questions.

    Enjoy!

     

    SQL Server 2005 Scalability

    “If you remove all global state your scalability will be linear”

     

    SQL Server 2005 Scalability

    nSQL Server 2005 is designed to scale!

    nNo matter how good database engine is if your application is designed to use global state – you will hit a bottleneck sooner or latter

    nTo get your application to scale – remove completely or at least partition global state your application is using

     

    What is SMP

    nSMP – Symmetric Multi-Processing

    nFront-bus point of contention

    nDifficult to scale beyond 32 CPU; Why? (Because of inherit global resource – front bus)

     

    What is NUMA

    nNUMA – Non-Uniform Memory Access

    nMinimize/eliminate front-bus contention to surpass scalability limits of SMP architecture

    nPerformance penalty for accessing foreign node memory

    n Server application such as SQL Server need to be NUMA-aware to take advantage of the node-locality design

    nNUMA scales; Why? (Because it partitions global resource – front bus)

     

    What is Interleaved-NUMA

    nEnable NUMA hardware to behave as SMP

    nMemory allocated from all nodes to average out memory access penalty

    nSQL Server 2000 should use interleaved-NUMA

     

    What is Soft-NUMA

    nActivates custom SQL Server NUMA configuration on top of any of hardware

    nRegistry settings control final SoftNUMA configuration

    nProvides greater performance, scalability, and manageability on SMP as well as on real NUMA hardware

     

    Effect of NUMA on Buffer Pool

    nBuffer pool will use remote (foreign) memory if necessary

    nMin/Max memory divided per node

    nE.g., 16gb max memory on 4-node NUMA, per node max is 4GB

    nAffinity mask change resulting in offlining node will redistribute min/max memory on remaining nodes, need to reconfigure min/max setting

    nNew performance counter objects – SQL Server: Buffer Nodes

     

    Application Partitioning

    nProvides resource partitioning of such resources as CPU and memory across different application for single SQL Server instance

    nAchieved by leveraging either NUMA or SoftNUMA configuration along with other new SQL Server features

    nEnables predicted resource distributions amongst different applications

    nEnables soft application isolation with better performance characteristics than multi-instance

     

     

  • Q & A: I can't make SQL Server 2005 32 bit version to use all memory on the machine

    I’m currently using the RTM+SP1 x86-SQL2005 build on a server which has 32GB of physical memory.  I notice that during ramp-up the memory usage (as seen under task manager) increases to about 28GB (i.e. “Available memory” reduces to 4GB) and stays there.   

     

    1. Is there a way to get around this and make SQL use at least a part of the remaining 4GB?So far for the “max server memory (MB)” I tried values of 2147483647(default setting) and 30000.  Both resulted in similar behavior as described above.  Attached is output of my sp_configure. I also tried to set min=max without any success

     

    1.  As all of you know limited amount of Virtual Address Space, VAS, might be a problem on 32bit systems.  For  SQL Server 2005 we made a decision to minimize VAS issues at the cost of amount of max physical memory SQL Server can actually use. The reason we made such decision is because for SQL Server VAS problems are much more severe usually than limitations of physical memory.  As you might expect, for SQL server to use additional physical memory it does need to allocate extra VAS regions. It means that for larger amounts of physical memory SQL Server will use more of VAS. If you want SQL Server 2005 to make use of all physical memory, basically get behavior of SQL Server 2000, you can use trace flag 836 during SQL Server startup. Remember this behavior is very 32bit specific

     

    Hope this helps

  • Set of new features in SQL Server 2005 enabled by SQLOS

    Couple of weeks ago I had a chance to visit set of our customers and talk to them directly. It was an unforgettable experience. As a part of the visit I made set of presentations related to SQLOS and new features it enables in SQL Server 2005. I thought that some of you might be interested to take a look at the presentation so below is presentation's outline. Let me know if you have any questions.

     

    What is New?

    • Dynamic affinity
    • Load balancing
    • Dynamic memory settings
    • Native NUMA support
    • SoftNUMA support
    • Application partitionning
    • Memory Broker
    • Common caching framework
    • DAC
    • Diagnostics

     

    Dynamic Affinity

    • Changes set of CPUs SQL Server is allowed to use
    • Use “sp_configure affinitymask*” to change SQL Server’s affinity. No longer requires SQL Server reboot
    • Enables great consolidation story for SQL Server and other applications on the same machine
    • Enables dynamic reconfiguration during failover
    • Before: SQL Server has to be rebooted for affinity changes to take effect

     

     

    Load Balancing

    • Distributes requests amongst schedulers
    • Connections are no longer bounded to specific CPU, scheduler
    • Homogeneous CPU load – doesn’t overload given CPU when using hard affinity and hence improves performance both response time or throughput
    • Before: Connection were bound to a given Scheduler/CPU when opened

     

    Dynamic memory settings

    • Controls amount of physical memory SQL Server can use
    • Use sp_configure max server memory or min server memory. No longer requires SQL Server restart for changes to take affect
    • Enables great consolidation story for SQL Server and other applications on the same machine
    • Provides dynamic reconfiguration during failover
    • Before: SQL Server had to be rebooted for memory changes to take effect when using AWE or locked pages in memory

     

    Native NUMA Support

    • Mimics hardware configuration
    • Automatically enables NUMA support on real NUMA hardware
    • Provides great scalability and performance on high end hardware
    • Enables application partitioning for better scalability, configuration and maintenance
    • Acts as perfect consolidation story along with dynamic affinity and memory settings
    • Before: Very restricted support for NUMA, only enabled with a specific trace flag

     

    SoftNUMA Support

    • Activates custom NUMA configuration on top of any hardware
    • Registry settings control SoftNUMA configuration
    • Provides greater performance, scalability, and manageability on SMP as well as on real NUMA hardware
    • Before: No SoftNUMA support

     

    Application Partitioning

    • Provides resource partitioning of such resources as CPU and memory across different application for single SQL Server instance
    • Achieved by leveraging either NUMA or SoftNUMA configuration along with other new SQL Server features
    • Enables predicted resource distributions amongst different application
    • Enables soft application isolation with better performance characteristics rather than multi instance
    • Before: Limited application partitioning support – only with VIA
    • Memory Broker
    • Provides dynamic memory distribution amongst large internal memory consumers such as Buffer Pool, Optimizer, Query Execution and Caches
    • Enabled automatically
    • Tunes memory usage by each memory consumer according to a specific load
    • Before: No dynamic memory redistribution

     

    Common Caching Framework

    • All caches share single common caching frame work
    • Enabled automatically
    • Provides finer grain caching control
      • All caches play together
      • Size of a single cache is restricted
      • New command dbcc freesystemcache
    • Single, common, response to different types of memory pressure
    • Before: Only database cache and procedure cache were sharing caching framework

     

    Dedicated Admin Connection (DAC)

    • Provides access to overloaded server
    • DAC resources are allocated during SQL Server startup
    • Great to control runaway queries that consume all resources
    • Can be used to run diagnostic queries
    • Considered to be a last resort to connect to unhealthy server
    • Caution: Don’t run complex queries
    • Before: No DAC suported might result in SQL Server restart

     

    Diagnostics

    • Enables rich diagnostic and understanding of resource distribution/consumption by different components inside of SQL Server
    • Exposed through OS performance monitor and set of dynamic management views, DMVs
    • Enables DBA and Microsoft support engineers to resolve
      • Memory problems
      • High CPU utilization
      • High contention points
      • I/O subsystem bottlenecks
    • Before: Performance monitor only