Debugging Multi-Threaded Applications I - Deadlocks

Multi-threaded code can be challenging to write and debug.  Today, I'd like to spend some time talking about one of the most common issues in multi-threaded code: deadlocks.

Example deadlock
To begin, let's examine an example of the "classic" deadlock -- waiting on a synchronization object (e.g., ManualResetEvent) inside a lock statement.

using System;
using System.Threading;

class Program
{
    // synchronization event
    private ManualResetEvent SafeToContinue = new ManualResetEvent(false);

    // object being used by multiple threads
    private Object SharedObject = new Object();

    static void Main()
    {
        // Thread1 and Thread2 are instance methods (their bodies are
        //  shown below), so create an instance to run them on
        Program p = new Program();

        // start the threads
        Thread t1 = new Thread(new ThreadStart(p.Thread1));
        t1.Start();
        Thread.Sleep(10000);
        Thread t2 = new Thread(new ThreadStart(p.Thread2));
        t2.Start();
    }
}

Thread1 enters a lock, does some processing and waits for the other worker thread to notify it that any remaining processing can continue.
lock(this.SharedObject)
{
    // perform some work inside the lock

    // wait on the ManualResetEvent
    //  this is how the other thread(s) will let us know
    //  we can continue
    this.SafeToContinue.WaitOne();

    // perform the remainder of the work
}

Thread2 enters a lock (on the same object as Thread1), does some processing and signals the other worker thread that the application is now in a safe state for continued processing.
lock(this.SharedObject)
{
    // perform some work inside the lock

    // notify the other worker thread(s) that
    //  they may resume processing
    this.SafeToContinue.Set();
}

In the above example, depending on how much processing is performed in each thread prior to entering the lock, this may or may not result in a deadlock.  For this discussion, I have added a Sleep of 10 seconds between starting the worker threads to ensure that Thread2 will attempt to enter the lock after Thread1 has entered it.
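
If you would like to see the problem without hanging your process, here is a minimal sketch of the same pattern with timeouts added.  The class name DeadlockDemo, the two result flags, and the timeout values are my own assumptions for illustration; they are not part of the original example.  Thread1 still waits on the event while holding the lock, but gives up after 2 seconds, and Thread2 uses Monitor.TryEnter so it gives up on the lock after 1 second.

```csharp
using System;
using System.Threading;

class DeadlockDemo
{
    private readonly ManualResetEvent SafeToContinue = new ManualResetEvent(false);
    private readonly object SharedObject = new object();

    // Hypothetical flags recording what each thread observed
    public bool Thread1TimedOut;
    public bool Thread2TimedOut;

    private void Thread1()
    {
        lock (this.SharedObject)
        {
            // Wait for the signal while still holding the lock.
            // The 2-second timeout is an assumption so the demo ends.
            Thread1TimedOut = !this.SafeToContinue.WaitOne(TimeSpan.FromSeconds(2));
        }
    }

    private void Thread2()
    {
        // Try to take the same lock, but give up after 1 second
        // instead of blocking forever.
        if (Monitor.TryEnter(this.SharedObject, TimeSpan.FromSeconds(1)))
        {
            try { this.SafeToContinue.Set(); }
            finally { Monitor.Exit(this.SharedObject); }
        }
        else
        {
            Thread2TimedOut = true;
        }
    }

    public void Run()
    {
        Thread t1 = new Thread(Thread1);
        Thread t2 = new Thread(Thread2);
        t1.Start();
        Thread.Sleep(200); // give Thread1 time to enter the lock first
        t2.Start();
        t1.Join();
        t2.Join();
    }

    static void Main()
    {
        DeadlockDemo demo = new DeadlockDemo();
        demo.Run();
        Console.WriteLine("Thread1 timed out waiting for the event: " + demo.Thread1TimedOut);
        Console.WriteLine("Thread2 timed out waiting for the lock: " + demo.Thread2TimedOut);
    }
}
```

Under normal scheduling both flags should come back true: Thread2 never gets the lock, so it never sets the event that Thread1 is waiting for.  Without the timeouts, that is exactly the hang described above.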

Debugging deadlocks
Debugging deadlocks can be difficult.  In multi-threaded applications, timing is everything.  Any change in the application (for example, adding diagnostic logging) or in the runtime environment (a debug build, running under the debugger) can significantly alter application timing.  As a result, a very consistent deadlock may stop occurring as soon as you attempt to investigate it.  Sometimes, we get lucky and the deadlock still occurs while running under the debugger.

While running under the debugger, when your application stops responding (encounters the deadlock), do the following to find the cause.

  1. Click on the Break All button (looks like a pause button on a CD/Cassette player) or press Ctrl+Alt+Break
  2. Switch to the Threads window
    If the Threads window is not open, it can be accessed via the Debug | Windows | Threads menu item or by pressing Ctrl+Alt+H
    Using the above example, the Threads window will look similar to this:
     
    ID         Name       Location                          Priority  Suspend
    228702738  <No Name>  ClassicDeadlock1.Program.Thread1  Normal    0
    ed983d16   <No Name>  ClassicDeadlock1.Program.Thread2  Normal    0
  3. The source window will show the current source line for the currently active thread (highlighted in green, by default)
    In our example, we see that Thread1 is on the following line
    this.SafeToContinue.WaitOne();
  4. Double click on the Threads window entry for Thread2
    This will update the source view to show the line that Thread2 is currently executing.
    lock(this.SharedObject)
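
A small aside on the Name column: it shows <No Name> because the example never names its threads.  If you assign Thread.Name before calling Start, the debugger can show that name instead, which makes it easier to tell the workers apart.  Here is a minimal sketch; the class name NamedThreads and the helper RunNamed are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Threading;

class NamedThreads
{
    // Starts a worker with the given name and returns the name
    // that the worker itself observed while running.
    public static string RunNamed(string name)
    {
        string observed = null;
        Thread t = new Thread(() => { observed = Thread.CurrentThread.Name; });
        t.Name = name; // assign the name before starting the thread
        t.Start();
        t.Join();      // Join ensures the write to 'observed' is visible here
        return observed;
    }

    static void Main()
    {
        Console.WriteLine(RunNamed("Thread1"));
        Console.WriteLine(RunNamed("Thread2"));
    }
}
```

In the deadlock example, the equivalent change would be setting t1.Name and t2.Name in Main before the calls to Start.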

We have found our deadlock and can now fix the problem. 

In our example, Thread1 is waiting for Thread2 to set the SafeToContinue event while Thread2 is attempting to enter the lock that is currently held by Thread1.  Neither thread can make progress.  As we have seen, blocking inside of a lock can lead to undesired application behavior.  It is best to perform the minimum amount of work while inside of a lock (typically, simply updating the locked object).  By moving the SafeToContinue event usage outside of the lock, we are able to successfully run our example application.

After the fix, Thread1 becomes:
lock(this.SharedObject)
{
    // perform some work inside the lock
}

// wait on the ManualResetEvent
//  this is how the other thread(s) will let us know
//  we can continue
this.SafeToContinue.WaitOne();

lock(this.SharedObject)
{
    // perform the remainder of the work
}

And Thread2 becomes:
lock(this.SharedObject)
{
    // perform some work inside the lock
}

// notify the other worker thread(s) that
//  they may resume processing
this.SafeToContinue.Set();
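
Putting the fixed pieces together, here is a sketch of a complete program you can run to confirm that both threads now finish.  The class name FixedDemo, the StepsCompleted counter, the shortened Sleep, and the Join timeouts are my own additions for illustration; the locking pattern is the one shown above.

```csharp
using System;
using System.Threading;

class FixedDemo
{
    private readonly ManualResetEvent SafeToContinue = new ManualResetEvent(false);
    private readonly object SharedObject = new object();

    // Hypothetical stand-in for "work"; only updated while holding the lock
    public int StepsCompleted;

    private void Thread1()
    {
        lock (this.SharedObject)
        {
            StepsCompleted++; // first part of the work
        }

        // wait for the signal WITHOUT holding the lock
        this.SafeToContinue.WaitOne();

        lock (this.SharedObject)
        {
            StepsCompleted++; // remainder of the work
        }
    }

    private void Thread2()
    {
        lock (this.SharedObject)
        {
            StepsCompleted++; // Thread2's work
        }

        // signal outside the lock
        this.SafeToContinue.Set();
    }

    // Returns true if both threads finished within the timeout.
    public bool Run()
    {
        Thread t1 = new Thread(Thread1);
        Thread t2 = new Thread(Thread2);
        t1.Start();
        Thread.Sleep(200); // start Thread2 after Thread1, as in the example
        t2.Start();

        // Join with a timeout so a regression shows up as false
        bool t1Done = t1.Join(TimeSpan.FromSeconds(5));
        bool t2Done = t2.Join(TimeSpan.FromSeconds(5));
        return t1Done && t2Done;
    }

    static void Main()
    {
        FixedDemo demo = new FixedDemo();
        Console.WriteLine("Both threads finished: " + demo.Run());
        Console.WriteLine("Steps completed: " + demo.StepsCompleted);
    }
}
```

Because Thread1 releases the lock before waiting, Thread2 can enter it, do its work, and set the event.  Thread1 then wakes up and re-enters the lock for the remainder of its work, so all three steps complete.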

Enjoy!
-- DK

Disclaimer(s):
This posting is provided "AS IS" with no warranties, and confers no rights.