Detecting and automatically dumping hung GUI-based Windows applications


Written by Jeff Dailey 

My name is Jeff, and I’m an Escalation Engineer on the CPR Platforms team.  Following Tate’s blog on scoping hangs, I’d like to discuss a common category of hangs and some creative ways to track them down.  I will be providing a couple of labs to go with this post that you can run and debug on your machine, and I will also show you how to write a hang detection tool that dumps processes that go unresponsive.  In addition, I will be writing several more blog entries about the various hang scenarios contained in the badwindows.zip that is included with this blog.

GUI Hangs

Sometimes a GUI (Graphical User Interface) based Windows application, that is to say one that uses windows, buttons, scroll bars, etc., may stop responding (shown as Not Responding in Task Manager).  In most cases the rest of the operating system continues to function normally, but the application does not repaint or respond to mouse clicks or keystrokes.  Sometimes these problems are transient: your app may hang once or twice a day for 10-30 seconds.  In other cases it may hang for long periods of time or never recover.

To get a better understanding of this scenario, it’s important to know that all GUI-based Windows applications work by passing messages to one another via message queues.  Each Windows application typically has a single main thread that is responsible for processing these messages.  Though the application may be multithreaded, there will typically be one thread that processes messages.  This functionality is normally implemented in WinMain.  This thread does different tasks based on the messages it receives.  It could open a dialog, create another thread, take actions based on a mouse click, or even send a message to another Windows application or applications.

When your application stops responding, it’s generally because this thread made a blocking (synchronous) call that takes too long.  If the thread is unable to pull incoming messages from its message queue, the application will appear to be hung.  Most of the time, once you have a dump of the process, you can switch to thread 0 with ~0s in cdb or windbg, then run kb to see what the thread is blocking on, or possibly looping in, that prevents it from processing messages.  If thread 0 is not the thread processing your messages, you may be able to find it by dumping all the thread stacks with ~*kb.

The problem is that you may not be able to fire up cdb or windbg to get a dump in time, or you may have a non-technical user community that does not know about debuggers or creating dumps.  In this case you can do what I sometimes do.

Create your own tool.

That’s right.  Sometimes I will see a scenario that warrants a slightly more elegant solution, and there is nothing more powerful than a determined engineer and a C compiler.

What is required?  Visual Studio (The Express edition is free), Windows SDK (free), the debugger SDK (free with Debugging Tools Install), and a little knowledge of how Windows works.

Let’s take a moment and think about what our ideal debug application will and won’t do.  

1.     It will be easy to use and configure.

2.     It will not break or negatively impact our operating system.  That is to say, it will not use much CPU or resources.

3.     It will wait quietly for our desired condition (in this case a hung window) to manifest.

4.     It will spring into action and gather the critical information about the state of our misbehaving application by creating a dump file without raising a fuss.

5.     It will be multi-user aware and will not place dumps in insecure locations; by default the dumps will go in the user’s temp directory.

6.     We will only collect a limited number of dump files so we do not fill up the hard drive.

7.     It will notify the admin of a hang and dump event by putting a message in the event log. 

8.     It will execute an optional binary when a hang is detected.

 

Here are the details of how it will work.

To keep things simple we will just create a console application called dumphungwindow.exe.  It will run in a loop until we collect the desired number of dump files.  Every so many seconds it will wake up, get the topmost window, and loop through each top-level window, sending it a message with the SendMessageTimeout API.  If any window takes longer than the timeout we specify to respond, we will create a dump of that process and log an event in the event log.
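The control flow described above can be sketched without any Windows calls.  This is only an illustration of the scan-and-count logic, not the tool itself: probe_fn and on_hang_fn are hypothetical callbacks standing in for SendMessageTimeout and for the dump-and-log step, and the demo_ functions are stubs that pretend window 2 is hung so the sketch can run anywhere.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for SendMessageTimeout: nonzero means the window answered
   within timeout_ms. Hypothetical callback, not a Windows API. */
typedef int (*probe_fn)(size_t window, unsigned timeout_ms);

/* Stand-in for "write a dump and log an event". Also hypothetical. */
typedef void (*on_hang_fn)(size_t window);

/* Walk every window once. Each window that fails the probe is reported,
   until the running total reaches max_dumps. Returns the new total. */
static int scan_windows(size_t window_count, unsigned timeout_ms,
                        probe_fn probe, on_hang_fn on_hang,
                        int dumps_taken, int max_dumps)
{
    assert(probe != NULL && on_hang != NULL);
    for (size_t w = 0; w < window_count && dumps_taken < max_dumps; w++) {
        if (!probe(w, timeout_ms)) {
            on_hang(w);
            dumps_taken++;
        }
    }
    return dumps_taken;
}

/* Demo stubs: window 2 never answers; everything else does. */
static int demo_reported = 0;

static int demo_probe(size_t window, unsigned timeout_ms)
{
    (void)timeout_ms;
    return window != 2;
}

static void demo_on_hang(size_t window)
{
    (void)window;
    demo_reported++;
}
```

A caller would invoke scan_windows once per scan interval, passing the running total back in each time, and stop once the returned total equals the maximum number of dumps.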

The sample dumphungwindow.zip, with badwindow.zip embedded within it, is available for download here.  It contains the EXEs and the Visual Studio 2005 project with all of the source.  The tool project is called dumphungwindow, and the test application is in a project called badwindow, which contains a lab with three different hang scenarios that cause a window to stop responding.

The command line options are as follows.

C:\source\dumphungwindow\debug>dumphungwindow.exe /?
 This sample application shows you how to use the debugger
 help api to dump an application if it stops responding.

 This tool depends on dbghelp.dll, this comes with the Microsoft debugger tools on www.microsoft.com

 Please make sure you have the debugger tools installed before running this tool.
 This tool is based on sample source code and is provided as is without warranty.

 feel free to contact jeffda@microsoft.com to provide feedback on this sample application

 /m[Number] Default is 5 dumps

 The max number of dumps to take of hung windows before exiting.

 /t[Seconds]  Default is 5 seconds

 The number of seconds a window must hang before dumping it.

 /p[Seconds] Default is 0 seconds

 The number of seconds to pause when dumping before continuing scan.

 /s[Seconds] Default is 5 seconds.

 The scan interval in seconds to wait before rescanning all windows.

 /d[DUMP_FILE_PATH] The default is the user's temp folder

 The path or location to place the dump files.

 /e[EXECUTABLE NAME] This allows you to start another program if an application hangs
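The switches above all follow the same /x[value] pattern, with no space between the letter and its value.  As a rough sketch of how such arguments might be parsed defensively (this is not the tool's actual parser; struct options and parse_option are names made up for this example), the function below rejects anything that does not start with a slash and keeps the /d path inside its buffer:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for the settings; the real tool uses globals. */
struct options {
    int max_dumps;      /* /m */
    int hang_ms;        /* /t, stored in milliseconds */
    int scan_ms;        /* /s, stored in milliseconds */
    char dump_dir[260]; /* /d, always ends with a backslash */
};

/* Parse one "/x[value]" argument. Returns 1 if recognised, 0 otherwise. */
static int parse_option(const char *arg, struct options *opt)
{
    assert(arg != NULL && opt != NULL);
    if (arg[0] != '/' || arg[1] == '\0')
        return 0;

    const char *value = arg + 2;
    switch (arg[1]) {
    case 'm': case 'M': opt->max_dumps = atoi(value);      return 1;
    case 't': case 'T': opt->hang_ms = atoi(value) * 1000; return 1;
    case 's': case 'S': opt->scan_ms = atoi(value) * 1000; return 1;
    case 'd': case 'D': {
        size_t len = strlen(value);
        /* Leave room for a trailing backslash and the terminator. */
        if (len == 0 || len + 2 > sizeof opt->dump_dir)
            return 0;
        memcpy(opt->dump_dir, value, len + 1);
        if (opt->dump_dir[len - 1] != '\\') {
            opt->dump_dir[len] = '\\';
            opt->dump_dir[len + 1] = '\0';
        }
        return 1;
    }
    default:
        return 0;
    }
}
```

Calling parse_option on argv[1] through argv[argc-1] leaves the program name in argv[0] out of the parse, so its second character can never be mistaken for a switch.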

To run the tool, simply start dumphungwindow.exe.  The output should look something like this.

C:\source\dumphungwindow\debug>dumphungwindow.exe
Dumps will be saved in C:\Users\jeff\AppData\Local\Temp\
scanning for hung windows

****

To start our bad application, extract the badwindowapp.zip file contained in the dumphungwindows.zip.

 

Then run badwindow.exe and from the menu select hang \ hang type 2.

 

After a few seconds dumphungwindow should detect the unresponsive badwindow.exe and generate a dump.

 

Hung Window found dumping process (7064) badwindow.exe

Dumping unresponsive process

C:\Users\jeffda\AppData\Local\Temp\HWNDDump_Day5_29_2007_Time10_36_38_Pid7064_badwindow.exe.dmp
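The dump file name shown above packs the date, time, process ID and executable name into one string.  Here is a minimal sketch of building that name with a bounded snprintf, so an unusually long module name cannot overrun the buffer.  The helper name build_dump_name and its parameter list are assumptions made for this example; the format string mirrors the one used in the source below.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "HWNDDump_Day<M>_<D>_<Y>_Time<h>_<m>_<s>_Pid<pid>_<exe>.dmp".
   Returns 1 on success, 0 if exe is empty or the name does not fit. */
static int build_dump_name(char *buf, size_t size,
                           int month, int day, int year,
                           int hour, int minute, int second,
                           unsigned long pid, const char *exe)
{
    assert(buf != NULL && exe != NULL);
    if (strlen(exe) == 0)
        return 0;

    int n = snprintf(buf, size,
                     "HWNDDump_Day%d_%d_%d_Time%d_%d_%d_Pid%lu_%s.dmp",
                     month, day, year, hour, minute, second, pid, exe);

    /* snprintf reports the length it wanted; anything >= size was cut. */
    return n >= 0 && (size_t)n < size;
}
```

With the values from the sample run (5/29/2007, 10:36:38, PID 7064, badwindow.exe) this produces the same file name as the output above.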

 

 

Please take a moment to review the source.  I’ve added comments that explain how we go about finding the hung window and how we write it to a dump file you can open in windbg.

 

Feel free to download and try out dumphungwindow against the badwindow.exe application.  Try looking at “hang type 1” first, as that will be the subject of my next blog.  Over the coming weeks I’ll be writing about hang types 1, 2 and 3 in the badwindow.exe application.  Once you have the dump file you can open it inside of windbg via file \ open crash dump.  See Debugging tools for the install location.

 

I hope you find this tool and sample helpful.

 

Thank you, Jeff

 

 

/********************************************************************************************************************

Warranty Disclaimer

--------------------------

This sample code, utilities, and documentation are provided as is, without warranty of any kind. Microsoft further disclaims all

implied warranties including without limitation any implied warranties of merchantability or of fitness for a particular  purpose.

The entire risk arising out of the use or performance of the product and documentation remains with you.

 

In no event shall Microsoft be liable for any damages whatsoever  (including, without limitation, damages for loss of business

profits, business interruption, loss of business information, or other pecuniary loss) arising out of the use of or inability to

use the sample code, utilities, or documentation, even if  Microsoft has been advised of the possibility of such damages.

Because some states do not allow the exclusion or limitation of liability for consequential or incidental damages, the above

limitation may not apply to you.

 

********************************************************************************************************************/

 

#include <stdio.h>

#include <windows.h>

#include <dbghelp.h>

#include <psapi.h>

 

// don't warn about old school strcpy etc.

#pragma warning( disable : 4996 )

 

int iMaxDump=5;

int iDumpsTaken=0;

int iHangTime=5000;

int iDumpPause=1;

int iScanRate=5000;

HANDLE hEventLog;

char * szDumpLocation;

int FindHungWindows(void);

char * szDumpFileName = 0;

char * szEventInfo = 0;

char * szDumpFinalTarget = 0;

char * szModName = 0;

char * szAppname = 0;

DWORD dwExecOnHang = 0;

 

#define MAXDUMPFILENAME 1000

#define MAXEVENTINFO 5000

#define MAXDUMPFINALTARGET 2000

#define MAXDUMPLOCATION 1000

#define MAXAPPPATH 1000

#define MAXMODFILENAME 500

#define HMODSIZE 255

 

int main(int argc, char * argv[])

{

      int i;

      int z;

      size_t j;

      char scan;

 

      // check to make sure we have dbghelp.dll on the machine.

      if(!LoadLibrary("dbghelp.dll"))

      {

            printf("dbghelp.dll not found please install the debugger tools and place this tool in \r\nthe debugging tools directory or a copy of dbghelp.dll in this tools directory\r\n");

            return 0;

      }

 

      // Allocate a buffer for our dump location

      szDumpLocation = (char *)malloc(MAXDUMPLOCATION);

      {

            if(!szDumpLocation)

            {

            printf("Failed to alloc buffer for szdumplocation %d",GetLastError());

            return 0;

            }

      }

 

      szAppname = (char *)malloc(MAXAPPPATH);

      {

            if(!szAppname)

            {

            printf("Failed to alloc buffer for szAppname  %d",GetLastError());

            return 0;

            }

      }

 

      // We use temp path because if we are running under terminal server sessions we want the dump to go to each

      // user's secure location, i.e. their private temp dir.

      GetTempPath(MAXDUMPLOCATION, szDumpLocation );

     

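      // Start at 1: argv[0] is the program name, not a switch.

      for (z=1;z<argc;z++)

      {

            // Skip anything that is not a /x or -x style switch.

            if ((argv[z][0]!='/' && argv[z][0]!='-') || argv[z][1]=='\0')

            {

                  continue;

            }

            switch(argv[z][1])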

            {

            case '?':

                  {

                  printf("\n This sample application shows you how to use the debugger \r\n help api to dump an application if it stops responding.\r\n\r\n");

                  printf("\n This tool depends on dbghelp.dll, this comes with the Microsoft debugger tools on www.microsoft.com");

                  printf("\n Please make sure you have the debugger tools installed before running this tool.");

                  printf("\n This tool is based on sample source code and is provided as is without warranty.");

                  printf("\n feel free to contact jeffda@microsoft.com to provide feedback on this sample application\r\n\r\n");

                  printf(" /m[Number] Default is 5 dumps\r\n The max number of dumps to take of hung windows before exiting.\r\n\r\n");

                  printf(" /t[Seconds]  Default is 5 seconds\r\n The number of seconds a window must hang before dumping it. \r\n\r\n");

                  printf(" /p[Seconds] Default is 0 seconds\r\n The number of seconds to pause when dumping before continuing scan. \r\n\r\n");

                  printf(" /s[Seconds] Default is 5 seconds.\r\n The scan interval in seconds to wait before rescanning all windows.\r\n\r\n");

                  printf(" /d[DUMP_FILE_PATH] The default is the user's temp folder\r\n The path or location to place the dump files.  \r\n\r\n");

                  printf(" /e[EXECUTABLE NAME] This allows you to start another program if an application hangs\r\n\r\n");

 

                  return 0;

                  }

            case 'm':

            case 'M':

                  {

                        iMaxDump = atoi(&argv[z][2]);

                        break;

                  }

            case 't':

            case 'T':

                  {

                        iHangTime= atoi(&argv[z][2]);

                        iHangTime*=1000;

                        break;

                  }

            case 'p':

            case 'P':

                  {

                        iDumpPause= atoi(&argv[z][2]);

                        iDumpPause*=1000;

                        break;           

                  }

            case 's':

            case 'S':

                  {

                        iScanRate = atoi(&argv[z][2]);

                        iScanRate*=1000;             

                        break;

                  }

            case 'd':

            case 'D':

                  { // Dump file directory path

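                        // Bounded copy that leaves room for a trailing backslash and terminator.

                        strncpy(szDumpLocation,&argv[z][2],MAXDUMPLOCATION-2);

                        szDumpLocation[MAXDUMPLOCATION-2]='\0';

                        j = strlen(szDumpLocation);

 

                        // Make sure the path ends with a backslash.

                        if (j>0 && szDumpLocation[j-1]!='\\')

                        {

                              szDumpLocation[j]='\\';

                              szDumpLocation[j+1]='\0';

                        }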

                        break;

                  }

            case 'e':

            case 'E':

                  { // application path to exec if hang happens

                        strncpy(szAppname,&argv[z][2],MAXAPPPATH-1);

                        szAppname[MAXAPPPATH-1]='\0';

                        dwExecOnHang = 1;

                        break;

                  }

            }

      }

 

 

      printf("Dumps will be saved in %s\r\n",szDumpLocation);

      puts("scanning for hung windows\n");

 

      // Register as an event source so ReportEvent can write to the application log.

      hEventLog = RegisterEventSource(NULL, "HungWindowDump");

 

      i=0;

      scan='*';

      while(1)

      {

            if(i>20)

            {

                  if ('*'==scan)

                  {

                        scan='.';

                  }

                  else

                  {

                        scan='*';

                  }

                  printf("\r");

                  i=0;

            }

            i++;

            putchar(scan);

            if(!FindHungWindows())

            {

                  return 0;

            }

            if (iMaxDump == iDumpsTaken)

            {

                  printf("\r\n%d Dumps taken, exiting\r\n",iDumpsTaken);

                  return 0;

            }

            Sleep(iScanRate);

      }

 

      free(szDumpLocation);

      return 0;

}

 

int FindHungWindows(void)

{

DWORD_PTR dwResult = 0; // SendMessageTimeout writes a pointer-sized result

DWORD ProcessId = 0;

DWORD tid = 0;

DWORD dwEventInfoSize = 0;

 

// Handles

HWND hwnd = 0;

HANDLE hDumpFile = 0;

HANDLE hProcess = 0;

HRESULT hdDump = 0;

 

SYSTEMTIME SystemTime;

MINIDUMP_TYPE dumptype = (MINIDUMP_TYPE) (MiniDumpWithFullMemory | MiniDumpWithHandleData | MiniDumpWithUnloadedModules | MiniDumpWithProcessThreadData);

 

// The global string buffers are persistent: allocated once and reused on every scan.

 

// security stuff to report the SID of the dumper to the event log.

PTOKEN_USER pInstTokenUser;

HANDLE ProcessToken;

TOKEN_INFORMATION_CLASS TokenInformationClass = TokenUser;

DWORD ReturnLength =0;

 

// This allows us to get the first window in the chain of top windows.

hwnd = GetTopWindow(NULL);

if(!hwnd)

{

      printf("Could not GetTopWindow\r\n");

      return 0;

}

 

// We will iterate through all windows until we get to the end of the list.

while(hwnd)

{

      // Get the process ID for the current window   

      tid = GetWindowThreadProcessId(hwnd, &ProcessId);

 

      // Send a message to this window with our timeout.

      // If it times out we consider the window hung

      if (!SendMessageTimeout(hwnd, WM_NULL, 0, 0, SMTO_BLOCK, iHangTime, &dwResult))

      {

            // SendMessageTimeout can fail for other reasons,

            // if it's not a timeout we exit and try again later

            if(ERROR_TIMEOUT != GetLastError())

            {

                  printf("SendMessageTimeout has failed with error %d\r\n",GetLastError());

                  return 1;

            }

                  // Init our static buffer pointers.

                  // On our first trip through if we have not

                  // malloced memory for our buffers do so now.

                  if(!szModName)

                  {

                        szModName = (char *)malloc(MAXMODFILENAME);

                        {

                              if(!szModName)

                              {

                              printf("Failed to alloc buffer for szModName %d",GetLastError());

                              return 0;

                              }

                        }

                  }

                  if(!szDumpFileName)// first time through malloc a buffer.

                  {

                        szDumpFileName = (char *)malloc(MAXDUMPFINALTARGET);

                        {

                              if(!szDumpFileName)

                              {

                                    printf("Failed to alloc buffer for dumpfilename %d",GetLastError());

                                    return 0;

                              }

                        }

                  }

                  if(!szDumpFinalTarget)// first time through malloc a buffer.

                  {

                        szDumpFinalTarget= (char *)malloc(MAXDUMPFINALTARGET);

                        {

                              if(!szDumpFinalTarget)

                              {

                              printf("Failed to alloc buffer for dumpfiledirectory %d",GetLastError());

                              return 0;

                              }

                        }

                  }

                  if(!szEventInfo)

                  {

                        szEventInfo= (char *)malloc(MAXEVENTINFO);

                        {

                              if(!szEventInfo)

                              {

                              printf("Failed to alloc buffer for szEventInfo %d",GetLastError());

                              return 0;

                              }

                        }

                  }

                  // End of initial buffer allocations.

 

            GetLocalTime (&SystemTime);

           

            // Using the process id we open the process for various tasks.

            hProcess = OpenProcess(PROCESS_ALL_ACCESS,FALSE,ProcessId);

            if(!hProcess )

            {

                  printf("Open process of hung window failed with error %d\r\n",GetLastError());

                  return 1;

            }

            // What is the name of the executable?

            GetModuleBaseName( hProcess, NULL, szModName,MAXMODFILENAME);

 

            printf("\r\n\r\nHung Window found dumping process (%d) %s\n",ProcessId,szModName);

 

            // Here we build the dump file name time, date, pid and binary name

            sprintf(szDumpFileName,"HWNDDump_Day%d_%d_%d_Time%d_%d_%d_Pid%d_%s.dmp",SystemTime.wMonth,SystemTime.wDay,SystemTime.wYear,SystemTime.wHour,SystemTime.wMinute,SystemTime.wSecond,ProcessId,szModName);

            strcpy(szDumpFinalTarget,szDumpLocation);

            strcat(szDumpFinalTarget,szDumpFileName);

 

            // We have to create the file and then pass its handle to the dump api

            hDumpFile = CreateFile(szDumpFinalTarget,FILE_ALL_ACCESS,0,NULL,CREATE_ALWAYS,FILE_ATTRIBUTE_NORMAL,NULL);

            // CreateFile returns INVALID_HANDLE_VALUE on failure, not NULL.

            if(hDumpFile == INVALID_HANDLE_VALUE)

            {

                  printf("CreateFile failed to open dump file at location %s, with error %d\r\n",szDumpLocation,GetLastError());

                  CloseHandle (hProcess);

                  return 0;

            }

 

            printf("Dumping unresponsive process\r\n%s",szDumpFinalTarget);

           

            // This dump api will halt the target process while it writes its

            // image to disk in the form of a dump file.

            // This can be opened later by windbg or cdb for debugging.

            if(!MiniDumpWriteDump(hProcess,ProcessId,hDumpFile,dumptype ,NULL,NULL,NULL))

            {

                  // We do this on failure

                  hdDump = HRESULT_FROM_WIN32(GetLastError());

                  printf("MiniDumpWriteDump failed with a hresult of %d last error %d\r\n",hdDump,GetLastError());

                  CloseHandle (hDumpFile);

                  CloseHandle (hProcess);

                  return 0;

            }

            else

            {

                  // If we are here the dump worked.  Now we need to notify the machine admin by putting an event in

                  // the application event log so someone knows a dump was taken and where it is stored.

                  sprintf(szEventInfo,"An application hang was caught by dumphungwindow.exe, the process was dumped to %s",szDumpFinalTarget);

 

                  // We need to get the process token so we can get the user SID so ReportEvent will have the

                  // User name / account in the event log.  If we cannot get it we report without a SID.

                  pInstTokenUser = NULL;

                  if (OpenProcessToken(hProcess,      TOKEN_QUERY,&ProcessToken ) )

                  {

                        // Make the first call to find out how big the buffer needs to be.

                        GetTokenInformation(ProcessToken,TokenInformationClass, NULL,0,&ReturnLength);

                        pInstTokenUser = (PTOKEN_USER) malloc(ReturnLength);

                        if(!pInstTokenUser)

                        {

                              printf("Failed to malloc buffer for InstTokenUser exiting error %d\r\n",GetLastError());

                              CloseHandle (ProcessToken);

                              CloseHandle (hDumpFile);

                              CloseHandle (hProcess);

                              return 0;

                        }

                        if(!GetTokenInformation(ProcessToken,TokenInformationClass, (VOID *)pInstTokenUser,ReturnLength,&ReturnLength))

                        {

                              printf("GetTokenInformation failed with error %d\r\n",GetLastError());

                              free(pInstTokenUser);

                              CloseHandle (ProcessToken);

                              CloseHandle (hDumpFile);

                              CloseHandle (hProcess);

                              return 0;

                        }

                        CloseHandle (ProcessToken);

                  }

                  // write the application event log message.

                  // This will show up as source HungWindowDump

                  dwEventInfoSize=(DWORD)strlen(szEventInfo);

     

                  ReportEvent(hEventLog,EVENTLOG_WARNING_TYPE,1,1,pInstTokenUser ? pInstTokenUser->User.Sid : NULL,0,dwEventInfoSize,NULL,szEventInfo);

 

                  // Free the token buffer, we don't want to leak anything.

                  free(pInstTokenUser);

                 

                  // In addition to leaking a handle, if you don't close the handle

                  // you may not get the dump to flush to the hard drive.

                  CloseHandle (hDumpFile);

                  // We are done with the target process as well.

                  CloseHandle (hProcess);

                  printf("\r\nDump complete");

                 

                  // This allows you to execute something if you get a hang like crash.exe

                  if (dwExecOnHang)

                  {

                        system(szAppname);

                  }

                 

                  //  The Sleep is here so in the event you want to wait N seconds

                  //  before collecting another dump

                  //  you can pause.  This is helpful if you want to see if any

                  //  forward progress is happening over time

                 

                  Sleep(iDumpPause);

            }

            // Once we are at our threshold for max dumps

            // we exit so we do not fill up the hard drive.

            iDumpsTaken++;

            if (iMaxDump == iDumpsTaken)

            {

                  return 0;

            }

        }

        // This is where we traverse to the next window.

            hwnd = GetNextWindow(hwnd, GW_HWNDNEXT);

      }

      return 1;

}

 

  • After running badwindow.exe and selecting / hang type 2, my system is hung ;^) and I can't break out without a reset.  Did I miss something?  Is there a way to kill badwindow?

    [For a detailed description of what scenario two does please review the following blog entry.

    http://blogs.msdn.com/ntdebugging/archive/2007/06/15/hung-window-no-source-no-problem-part-2.aspx

    Hang type two should only hang the application badwindow.exe and should not be able to hang the entire OS. Is it possible that you have the window maximized? If so, nothing in badwindow.exe would respond to a click. This scenario is simply a blocked critical section. A critical section in this context should only affect the scope of one application.

    Thank you, Jeff Dailey, Platforms Escalation Engineer. ]
  • Written by Jeff Dailey: As a debugger, have you ever reflected on the interesting parallels between your

  • This is really helpful. Thanks for writing so well and so much about the nitty gritty of debugging hung computers.

  • I'm not sure I understand the nature of the following question.  Can you please elaborate?

    Jeff-

    -------------------------------------------------------------

     

    I have a problem with analysing the dumps, because of the way the failing executable is located: it must be in the same folder as on the customer's system.  For example, today I received a dump from Germany, and had to copy my application files to the folder "C:\Programme\<My app folder>", before I could use the dump to debug it.  To save having to keep doing this, is there any way of telling Visual Studio 2008 where to locate the EXE on my system?

  • Really useful. And I, [as an old fashioned software writer] particularly appreciated the fact that the sample is written in plain C, instead of piling up with lots of obscure [to me] C++ classes :-)

  • Just a couple of tiny nit-picks:

    You have findhungwind in the sprintf; looks like the application started life as a different name...

    In the error handling of the second GetTokenInformation call, you didn't do a 'free(pInstTokenUser);'.

    ;-)

    Overall, great application, great programming style and great approach. It is a great example of how a couple of API calls can do so much.

    Andrew Richards

  • This document is a translation of the http://blogs.msdn.com/ntdebugging blog, and the original material may be changed without notice. This material carries no legal warranty and

  • I am unable to download the lab.  Is the link broken?

    [Sorry Michael, this is no longer available.  The source is given so you should be able to build dumphungwindow.exe.  You can probably create a badwindow app by putting a loop or a sleep call in the message loop.]
