I thought I would summarize the concepts of the SOS scheduler and the deadlocked scheduler, since many people have asked over the years for a brief overview of these topics.

 

UMS/SOS Scheduler in SQL Server

Since SQL Server 7.0, SQL Server has used its own scheduling mechanism, called UMS (User Mode Scheduler) in 7.0 and 2000 and later renamed to SOS (SQL on OS scheduler). UMS/SOS is a cooperative (non-preemptive) scheduler, which means that it relies on threads to voluntarily give up CPU usage - yield - to the next thread waiting in line. That means there are locations in the SQL Server code where the Microsoft developer built in yield points, causing execution to "pause" and gracefully let another citizen of the SQL kingdom exercise its right to execute. Of course, once a thread "leaves" the SQL Server SOS kingdom, that thread is still handed off to the Windows preemptive scheduler (ruled by another master, so to speak). The goal of SOS is for SQL Server to expose only one thread at a time per CPU and thus minimize competition among the SQL threads exposed to the OS. The ultimate goal is to reduce the very expensive kernel-mode context switching from one thread to another.
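
If you want to see these schedulers on your own instance, the sys.dm_os_schedulers DMV exposes one row per scheduler. Here is a minimal sketch (the column selection is just a suggestion; output varies by build and hardware):

-- One row per scheduler; the VISIBLE ONLINE schedulers are the ones running user requests
SELECT scheduler_id, cpu_id, status,
       current_tasks_count, runnable_tasks_count, work_queue_count
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';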

Since SQL Server uses a cooperative scheduler, it relies on the good heart of each developer who writes code in SQL Server to call a Yield() function of some kind that prevents the thread from monopolizing the CPU. But even the best-intentioned developer can make mistakes (and introduce a bug in the code), or a SQL thread can be at the mercy of some external component - for example, when calling into code outside of SQL Server. Because of that, the SOS scheduler has a dedicated thread - the Scheduler Monitor - that periodically checks the state of each scheduler and reports any "irregularities". Examples of irregularities include a thread not yielding voluntarily (a non-yielding scheduler) or all schedulers being "stuck" and not processing any requests (deadlocked schedulers). Typically this Scheduler Monitor thread reports problems every 60 seconds, though it monitors the IOCP thread that accepts connections every 15 seconds.
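
The Scheduler Monitor also writes periodic health records to a ring buffer that you can peek at. This DMV is undocumented, so treat the query below as a sketch that may behave differently across versions:

SELECT timestamp, CONVERT(xml, record) AS record
FROM sys.dm_os_ring_buffers
WHERE ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR';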

An entire white paper could be written about how the SOS scheduler works, but for this discussion I will limit the information to some summary points. Actually, a white paper has already been written - the SOS Scheduler white paper.

What is a Deadlocked Scheduler?

A deadlocked scheduler scenario is reported when ALL SOS schedulers appear to be "stuck". Technically, this means that since the last Scheduler Monitor check - say about 60 seconds earlier - nothing has changed and no work is being processed. Here are some of the things checked:

  • Are there any tasks waiting to be processed? (check if sys.dm_os_schedulers.work_queue_count > 0)
  • Have any new worker threads been created since last check?
  • Has any work been processed since last check (i.e. have any new tasks been assigned to workers)?

If any of these conditions is met (not all of them), the Scheduler Monitor declares that the schedulers are stuck - deadlocked.

A note on NUMA: SQL Server can report Deadlocked Schedulers if all the SOS schedulers on a single NUMA node are stuck, even if other NUMA nodes are processing tasks just fine.
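
Just to illustrate the checks above (the Scheduler Monitor does all of this internally), here is a rough sketch of how you could eyeball the work queue per NUMA node yourself:

-- Queued tasks per NUMA node; work piling up in work_queue_count with nothing
-- being picked up is the pattern the Scheduler Monitor is looking for
SELECT parent_node_id,
       SUM(work_queue_count) AS queued_tasks,
       SUM(current_workers_count) AS workers,
       SUM(current_tasks_count) AS tasks
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE'
GROUP BY parent_node_id;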

When the Scheduler Monitor detects a "deadlocked schedulers" condition, it reports an error in the Errorlog and triggers a memory dump to be generated by SQLDumper.exe against the SQL Server process.

 

What "Deadlocked Schedulers" is not?

The phrase "deadlocked schedulers" has confused many people, and rightly so. The term does NOT imply a classic Lock Manager deadlock, where multiple sessions try to acquire locks in the opposite order and permanently block each other. The term applies to the SOS schedulers and the fact that none of them are processing queries, logins, etc. So if you are thinking of "traditional" lock deadlocks, steer away from such thoughts.

 

Most Common Causes of Deadlocked Schedulers:

Even though many conditions can cause all schedulers to be stuck at the same time, in 95% of the memory dumps I have looked at, the reason has been that the majority of worker threads were tied up waiting for a lock resource. Yes, the good ol' blocking chain with a head blocker that you are so familiar with is the most common example. This causes the pool of available SQL worker threads to be exhausted: the worker threads themselves are tied up waiting for a lock, and when new tasks (work) are submitted to SQL Server, there are no worker threads available to pick the tasks up and execute them (a quick way to check for worker thread exhaustion is shown after the list below). Of course, locks are not the only resource you see contention on. Here is the list of other types of resources that have caused long "blocking" (I use the term loosely here) chains:

 

  • Lock(s) - a very long blocking chain
  • Latch(es)
    • I/O latches - which means stuck I/O
    • Orphaned/unreleased page or non-page latches (commonly due to a previous AV exception or assert)
  • Spinlock(s)
    • Orphaned/unreleased spinlock (due to a previous AV, assert or other exception)
  • Parallel queries
    • A huge number of child threads spawned to service one or two queries, exhausting the worker threads on the system
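
As mentioned above, the common outcome in all of these is worker thread exhaustion. A quick, rough way to check how close you are to the limit is to compare the workers that have been created against the server's maximum (max_workers_count comes from sys.dm_os_sys_info):

-- Workers currently created vs. the maximum the server will allow
SELECT SUM(s.current_workers_count) AS workers_created,
       MAX(i.max_workers_count) AS max_workers
FROM sys.dm_os_schedulers AS s
CROSS JOIN sys.dm_os_sys_info AS i
WHERE s.status = 'VISIBLE ONLINE';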

 

How to Troubleshoot Deadlocked Schedulers

Some deadlocked scheduler scenarios will resolve themselves - as in the case of parallel queries, a long-held I/O latch or, rarely, blocking. Orphaned structures like spinlocks or latches require that you restart SQL Server. In most realistic cases, you will become aware of the issue only after it has been going on for a while or after it has resolved itself. Therefore, in those cases, you will be focusing on post-mortem analysis of the problem. However, in rare cases, you may be able to take action while the problem is happening and resolve it (see using the DAC connection below).

 

There are two ways to find the root cause of deadlocked schedulers:

1. Using the DAC connection and examining some DMVs

2. Examining a memory dump

 

1. Troubleshooting Deadlocked Schedulers using the DAC connection

In essence, the sole purpose of the Dedicated Admin Connection (DAC) is to troubleshoot server problems like deadlocked schedulers - that is, if you catch the problem "live", while it is happening. You can find details on how to connect to SQL Server using the DAC here. The most effective use of the DAC to investigate and resolve deadlocked scheduler problems is when the issue is caused by blocking. Luckily, this is the most common reason for deadlocked schedulers, so the DAC can be an effective tool.
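
For example, from a command prompt on the server itself (the server name is a placeholder, and -E assumes Windows authentication):

sqlcmd -S YourServerName -A -E

Keep in mind that, by default, the DAC only accepts local connections; remote DAC connections require the 'remote admin connections' configuration option to be enabled.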

So, once you connect with the DAC (I would suggest using sqlcmd -A instead of SSMS, since the latter may attempt multiple connections to SQL Server), you can investigate a few things (a query sketch follows the list):

 1. The state of all schedulers - sys.dm_os_schedulers - and specifically whether there is any queued work waiting.

 2. Whether there are hundreds of blocked requests - sys.dm_exec_requests with wait_type = LCK_* (any type of lock wait).

 3. Identify the head blocker from sys.dm_exec_requests by manually tracing which session is blocking all others, and terminate it by using the KILL command.
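
Here is a rough T-SQL sketch of those three steps. It is only a sketch - the column choices are mine, and the session id in the KILL example is a placeholder:

-- 1. Is there queued work that no worker has picked up?
SELECT scheduler_id, work_queue_count, current_workers_count
FROM sys.dm_os_schedulers
WHERE status = 'VISIBLE ONLINE';

-- 2. How many requests are stuck on lock waits?
SELECT COUNT(*) AS lock_waiters
FROM sys.dm_exec_requests
WHERE wait_type LIKE 'LCK%';

-- 3. Find sessions that block others but are not blocked themselves (head blockers)
SELECT r.blocking_session_id AS head_blocker
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0
  AND r.blocking_session_id NOT IN
      (SELECT session_id FROM sys.dm_exec_requests WHERE blocking_session_id <> 0)
GROUP BY r.blocking_session_id;

-- Once you have confirmed the head blocker, terminate it (53 is a placeholder)
-- KILL 53;

Note that if the head blocker is an idle session holding an open transaction, it may not appear in sys.dm_exec_requests at all; sys.dm_exec_sessions and sys.dm_tran_locks can help track it down in that case.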

 

Note: If the issue is caused by an orphaned latch or orphaned spinlock, then the only way to get out of this situation is to restart SQL Server.

 

2. Troubleshooting Deadlocked Schedulers by Examining a Memory Dump

 

Most commonly, the best way to find the root cause of deadlocked schedulers is to figure out what the majority of the worker threads are stuck behind. The way to do this is by examining the memory dumps that SQL Server automatically generates for this purpose.
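
As a very rough sketch, once the dump is open in WinDbg (the symbol path below is just the public Microsoft symbol server), the commands typically look something like this:

.sympath srv*c:\symbols*https://msdl.microsoft.com/download/symbols
.reload /f
~* kn
!findstack sqlservr!LockOwner::Sleep 2

~* kn dumps the call stack of every thread, and !findstack narrows the output to the threads sitting in a particular function - LockOwner::Sleep in the blocking example below.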

Here are some examples of the stacks you will observe if you examine a memory dump (details on how to do so using public symbols can be found here):

Blocking:

A typical call stack of a thread that is stuck on a lock wait will include the LockOwner::Sleep() function. Here is an example:

ntdll!KiFastSystemCallRet
ntdll!ZwSignalAndWaitForSingleObject+0xc
kernel32!SignalObjectAndWait+0xaf
sqlservr!SOS_Scheduler::Switch+0x81
sqlservr!SOS_Scheduler::SwitchContext+0x2fd
sqlservr!SOS_Scheduler::SuspendNonPreemptive+0xb9
sqlservr!SOS_Scheduler::Suspend+0x2e
sqlservr!SOS_Event::Wait+0x12a
sqlservr!LockOwner::Sleep+0x13e
sqlservr!lck_lockInternal+0x990
sqlservr!GetLock+0x178
sqlservr!BTreeRow::AcquireLock+0x14c
sqlservr!IndexRowScanner::AcquireNextRowLock+0x1da
sqlservr!IndexDataSetSession::GetNextRowValuesInternal+0x53a
sqlservr!IndexDataSetSession::GetNextRowValues+0x2d
sqlservr!RowsetNewSS::FetchNextRow+0x3e
sqlservr!CQScanRowsetNew::GetRowWithPrefetch+0x2b
sqlservr!CQScanRangeNew::GetRow+0x141
sqlservr!CQScanTopNew::GetRow+0x108
sqlservr!CQScanUpdateNew::GetRow+0x76
sqlservr!CQueryScan::GetRow+0x5a
sqlservr!CXStmtQuery::InitForExecute+0x51
sqlservr!CXStmtQuery::ErsqExecuteQuery+0x291
sqlservr!CXStmtDML::XretDMLExecute+0x284
sqlservr!CXStmtDML::XretExecute+0x8b
sqlservr!CMsqlExecContext::ExecuteStmts<0,1>+0x7cf
sqlservr!CMsqlExecContext::FExecute+0x550
sqlservr!CSQLSource::Execute+0x73b
sqlservr!CStmtPrepQuery::XretExecute+0x45b
sqlservr!CExecuteStatement::XretExecute+0x1b
sqlservr!CMsqlExecContext::ExecuteStmts<1,1>+0x360
sqlservr!CMsqlExecContext::FExecute+0x55b
sqlservr!CSQLSource::Execute+0x73b
sqlservr!process_request+0x2e3
sqlservr!process_commands+0x2e0
sqlservr!SOS_Task::Param::Execute+0xe2
sqlservr!SOS_Scheduler::RunTask+0xb9
sqlservr!SOS_Scheduler::ProcessTasks+0x141
sqlservr!SchedulerManager::WorkerEntryPoint+0x1a8
sqlservr!SystemThread::RunWorker+0x7f
sqlservr!SystemThreadDispatcher::ProcessWorker+0x225
sqlservr!SchedulerManager::ThreadEntryPoint+0x143
msvcr80!_callthreadstartex+0x1b
msvcr80!_threadstartex+0x66
kernel32!BaseThreadStart+0x34

 

Orphaned/unreleased Latch:

You will observe that most threads that are stuck behind a latch will have the LatchBase::Suspend() call in their stack. Here is an example:

 

 ntdll!ZwWaitForSingleObject+0xa
 kernel32!WaitForSingleObjectEx+0x130
 sqlservr!SOS_Scheduler::SwitchContext+0x623
 sqlservr!SOS_Scheduler::SuspendNonPreemptive+0xc5
 sqlservr!EventInternal<Spinlock<149,1,0> >::Wait+0x428
 sqlservr!LatchBase::Suspend+0x42c
 sqlservr!LatchBase::AcquireInternal+0xfa
 sqlservr!CTraceController::FAcquireLock+0x53
 sqlservr!CTraceController::CAutoLock::FAcquire+0x18
 sqlservr!CTraceController::ProduceRecord+0xb58456
 sqlservr!CRpcTraceHelper::TracePostRPC+0x347
 sqlservr!CRpcTraceHelper::TracePostExec+0xfc8b44
 sqlservr!CRPCExecEnv::OnExecFinish+0xaf
 sqlservr!process_request+0x399
 sqlservr!process_commands+0x1ba
 sqlservr!SOS_Task::Param::Execute+0x11b
 sqlservr!SOS_Scheduler::RunTask+0xca
 sqlservr!SOS_Scheduler::ProcessTasks+0x95
 sqlservr!SchedulerManager::WorkerEntryPoint+0x110
 sqlservr!SystemThread::RunWorker+0x60
 sqlservr!SystemThreadDispatcher::ProcessWorker+0x12c
 sqlservr!SchedulerManager::ThreadEntryPoint+0x12f
 msvcr80!_callthreadstartex+0x17
 msvcr80!_threadstartex+0x84
 kernel32!BaseThreadStart+0x3a 

 

Orphaned/unreleased Spinlock:

Here is what a stack might look like for a thread stuck behind a spinlock - note the SpinlockBase::Sleep() call:

 

 ntdll!ZwDelayExecution+0xa
 kernel32!SleepEx+0xaf
 sqlservr!SpinlockBase::Sleep+0x15d5a1
 sqlservr!SpinlockBase::SpinToAcquire+0x11b
 sqlservr!TSyncHashTable_EntryAccessor<...>::TSyncHashTable_EntryAccessor<...>+0x76
 sqlservr!CQSIndexStatsMgr::UsedFor+0x68
 sqlservr!CQueryScan::Uncache+0x4788a6
 sqlservr!CXStmtQuery::SetupQueryScanAndExpression+0x252
 sqlservr!CXStmtQuery::ErsqExecuteQuery+0x232
 sqlservr!CXStmtSelect::XretExecute+0x110
 sqlservr!CExecStmtLoopVars::ExecuteXStmtAndSetXretReturn+0x13
 sqlservr!CMsqlExecContext::ExecuteStmts<1,0>+0x9e5
 sqlservr!CMsqlExecContext::FExecute+0x4c5270
 sqlservr!CSQLSource::Execute+0x36d
 sqlservr!ExecuteSql+0x700
 sqlservr!CSpecProc::ExecuteSpecial+0x4c7
 sqlservr!CSpecProc::Execute+0x1d5
 sqlservr!process_request+0x312
 sqlservr!process_commands+0x1c4
 sqlservr!SOS_Task::Param::Execute+0xee
 sqlservr!SOS_Scheduler::RunTask+0xc9
 sqlservr!SOS_Scheduler::ProcessTasks+0xb4
 sqlservr!SchedulerManager::WorkerEntryPoint+0xe7
 sqlservr!SystemThread::RunWorker+0x59
 sqlservr!SystemThreadDispatcher::ProcessWorker+0x130
 sqlservr!SchedulerManager::ThreadEntryPoint+0x128
 msvcr80!_callthreadstartex+0x17
 msvcr80!_threadstartex+0x84
 kernel32!BaseThreadStart+0x3a

 

 

Namaste!

 

Joseph