CSS SQL Server Engineers

This is the official team Web Log for Microsoft Customer Service and Support (CSS) SQL Support. Posts are provided by the CSS SQL Escalation Services

How It Works: SQL Server 2005 Connection and Task Assignments

How It Works: SQL Server 2005 Connection and Task Assignments

Rate This
  • Comments 3

I have talked about how connections and tasks get assigned as PASS and during many other mentoring opportunities.  I just finished working on an issue that forced me to dig deeper into the subject.

The customer reported that on a NUMA machine the connections are always getting assign to the same node.   They had determined this by looking at the CPU usage information and the node_affinity assignments.

select endpoint_id, node_affinity, * from sys.dm_exec_connections

To set the stage the basic rules for connection to node assignments are.

  • Use only the nodes the port affinity indicates.  (See SQL Server Books Online for port to node binding details.)
  • Do round-robin connection assignment among the available nodes.  This takes into account the port affinity binding as well as the online status of the node.  (A node is considered offline when all schedulers are offline).

After more investigation I found that these rules need to be tempered with an endpoint discussion.   

select * from sys.endpoints

image

  • When the SQL Server is started it establishes the listen on endpoints.
  • Each endpoint results in an internal endpoint structure.   For the diagram above there would be an endpoint tracking the TCP Port, one tracking the Named Pipe and perhaps others for shared memory, and additional TCP ports.  This structure contains the OVERLAPPED members as well as a nextNode member.
  • Each node has an I/O completion listener (GetQueueCompletionStatus) that all endpoints and communication channels are associated with.
  • As the diagram shows each endpoint posts an async accept using its associated accept structure.   One of the IOCP listener threads is invoked (OS determined) during an connection request.   The IOCP does the minimum accept logic then looks at the nextNode information.  Incrementing the nextNode (round robin logic)  the IOCP thread queues the request to the desired node to complete login processing.
  • The accept structure is again used to post an async accept request for the designated endpoint.
  • Thus, the message is tempered by round-robin assignments within each endpoint and not across all endpoints.   It is possible to make a TCP connection followed by a Named Pipe connection and get connection affinity assignments to the same node.

    Connect / Disconnect

    Be careful when evaluating the assignments as some applications establish a couple of connections during initialization and disconnect all but one of them.   This can make it look like the connection affinity is not working properly.   Reviewing the pattern of connect and disconnects in a SQL trace along with understanding the endpoint used provides the proper insight.

    Scheduler Assignments

    Once the connection affinity is established a scheduler is selected.  

    • Only schedulers within the node are considered
    • Only online schedulers are considered
    • Only task counts are considered  (task activity/weight is not a consideration)
    • Scheduler assignment task place when the task is created
    • Scheduler assignment binding remains in affect until the task completes.

    Note:  Task can often be thought of as a batch.  Each batch is a new task request to server.

    During initial connection the scheduler with the lowest number of tasks is selected.   This scheduler becomes the preferred scheduler for the connection.  Keeping the connection on the same node and when possible on the same scheduler helps maintain resource locality and increase performance capabilities.

    After the connection is established incoming tasks for the session start by looking at the preferred scheduler.  As long as the preferred scheduler does not have a task count ~20% larger than the other available schedulers on the node the preferred scheduler is used.   If the threshold is exceeded the scheduler with the smallest task count is used.

    Parallel Queries

    Parallel queries bring a bit of a twist to the scheduler assignments.  The controlling task (think of this as your connection) always follows the scheduler assignment rules just described.

    The design is to keep all parallel threads within the same node when possible.  It does not matter if this is the same node as the controlling task.   Keeping all the parallel workers within the same locality increases performance significantly.

    When a parallel query starts it looks at the number of available workers on each node.  The node with the most available workers is selected as the node to queue the parallel worker to.   So you could have Port A bound to only Node 0 but the query runs on Node 1 when run in parallel.

    The other caveat with parallel query startup is that the SQL Server engine only looks at the available worker counts once per second.   So in rare situations it is possible for multiple connections to start parallel queries and they end up on the same node when you might expect them to use separate nodes.

    Bob Dorr
    SQL Server Senior Escalation Engineer

    Leave a Comment
    • Please add 7 and 5 and type the answer here:
    • Post
    • I am just wondering for parallel query everyone suggest to set the "max degree of parallelism Option" to 1 for all OLTP system becuase of the overhead of parallel plans execution. So what is benefit of buying server with 4 or 8 CPU?

      Thanks.

       

      [RDORR FEB 13 2008]

      http://blogs.msdn.com/psssql/archive/2008/02/13/how-it-works-sql-server-per-query-degree-of-parallelism-worker-count-s.aspx 

      The overhead of parallel is not bad it is just not a given.  Parallel is only elected if there is worker bandwidth to handle it.   What I see more than not is reliance on parallel when it won't happen when an index or query tuning exercise would be of more benefit.  I have seen lots of OLTP instances uisng MAXDOP=4 or 8 with great results. 

      The benefit of multi-CPU is that multiple queries can truly run at the same time.  If you only have one CPU you are context switching with every active query.  If you have a 4 CPU machine you distribute the load so you only contend for scheduler resources with ~1/4 of the active queries and so forth.

       

       

    • On SQLserver 2000 machine I always some cpu are more active than others. Is there any way I can find in 2000 how connection are assigned to nodes. I understand the scheduler and its assignments are rewritten in 2005.

      Thanks.

       

      [RDORR] SQL Server does not expose the scheduler id like SQL Server 2005.  You are experiancing the same issue we did in support and asked for an update.

      You can use sysprocesses KPID value and map to the thread objects in Perfmon to determine the basics.   Also use the CPU column in sysprocesses to help.

      In most cases I have sen it will be the CPU that Log Manager is assigned to.

    • Since most people in the United States do not work on Monday I decided to give you some stuff to read

    Page 1 of 1 (3 items)