Hung Window?, No Source?, No Problem!! Part 2


Written by Jeff Dailey

 

Hello, my name is Jeff. I’m an escalation engineer on the Microsoft CPR (Critical Problem Resolution) platforms team. This blog entry is part 2 of my Hung Window?, No Source?, No Problem!! series. In this lab we will debug a problem involving a multithreaded application and synchronization objects, look at the types of things that can go wrong, and see how to track them down. This process and training lab comes right out of our CPR training curriculum. To do the lab I have prepared for you, you will need dumphungwindow.exe and badwindow.exe from my earlier blog post. You will also need to install the Debugging Tools for Windows.

 

Debugging tools:

 http://www.microsoft.com/whdc/devtools/debugging/installx86.mspx

Previous blog http://blogs.msdn.com/ntdebugging/archive/2007/05/29/detecting-and-automatically-dumping-hung-gui-based-windows-applications.aspx

 

After you have both of these installed we can get started.  We are going to debug and figure out why the window stops repainting and does not respond.

 

Step 1 start badwindow.exe

Step 2 run dumphungwindow.exe

Step 3 select Hang \ Hang Type 2 from  the BadWindow.exe menu.

You should see dumphungwindow detect that your window is not processing messages; as a result it will dump the badwindow.exe process.

 

************ OUTPUT *************

C:\source\dumphungwindow\release>dumphungwindow.exe
Dumps will be saved in C:\Users\jeffda\AppData\Local\Temp\
scanning for hung windows


Hung Window found dumping process (12912) badwindow.exe
Dumping unresponsive process
C:\Users\jeffda\AppData\Local\Temp\HWNDDump_Day6_14_2007_Time7_34_5_Pid12912_badwindow.exe.dmp

Dump complete

 

Hung Window found dumping process (12912) badwindow.exe

Dumping unresponsive process
C:\Users\jeffda\AppData\Local\Temp\HWNDDump_Day6_14_2007_Time7_34_24_Pid12912_badwindow.exe.dmp

Dump complete

************ OUTPUT *************

 

Step 4 create a local symbol directory at C:\websymbols

Step 5 set your symbol path under file \ symbols in windbg to SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols

See http://www.microsoft.com/whdc/devtools/debugging/debugstart.mspx for details.


Step 6 start windbg, select file \ open crash dump, and open the first dump file.

Your initial output should look like this.

 

Microsoft (R) Windows Debugger  Version 6.7.0001.0

Copyright (c) Microsoft Corporation. All rights reserved.

 

***** WARNING: Your debugger is probably out-of-date.

*****          Check http://dbg for updates.

 

Loading Dump File [C:\Users\jeffda\AppData\Local\Temp\HWNDDump_Day6_12_2007_Time9_53_34_Pid7924_badwindow.exe.dmp]

User Mini Dump File with Full Memory: Only application data is available

 

Symbol search path is: SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols;srv

Executable search path is:

Windows Vista Version 6000 MP (2 procs) Free x86 compatible

Product: WinNt, suite: SingleUserTS

Debug session time: Tue Jun 12 09:53:35.000 2007 (GMT-4)

System Uptime: 11 days 18:41:43.089

Process Uptime: 0 days 0:00:32.000

....................................

Loading unloaded module list

.

eax=00000000 ebx=00000002 ecx=00000000 edx=00000000 esi=00000000 edi=00000000
eip=777faec5 esp=0017faf4 ebp=0017fb8c iopl=0         nv up ei pl nz na po nc
cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202
ntdll!ZwWaitForMultipleObjects+0x15:
777faec5 c21400          ret     14h

0:000> !reload

 

Step 7 locate the debugger prompt at the bottom of windbg (it has 0:000> next to it).
Type ~* k to list the call stack of every thread.

 

 

You will most likely see output similar to this.

.  0  Id: 3270.2b10 Suspend: 0 Teb: 7efdd000 Unfrozen

ChildEBP RetAddr 

0017faf0 76e4edb5 ntdll!ZwWaitForMultipleObjects+0x15

0017fb8c 76e430c3 kernel32!WaitForMultipleObjectsEx+0x11d

0017fba8 00401502 kernel32!WaitForMultipleObjects+0x18

0017fbc8 0040139b badwindow!hangtype2+0x42 [c:\source\badwindow\badwindow\badwindow.cpp @ 340]

0017fc24 772a87af badwindow!WndProc+0x17b [c:\source\badwindow\badwindow\badwindow.cpp @ 274]

0017fc50 772a8936 user32!InternalCallWinProc+0x23

0017fcc8 772a8a7d user32!UserCallWinProcCheckWow+0x109

0017fd2c 772a8ad0 user32!DispatchMessageWorker+0x380

0017fd3c 004010fb user32!DispatchMessageW+0xf

0017ff0c 00401817 badwindow!wWinMain+0xfb [c:\source\badwindow\badwindow\badwindow.cpp @ 124]

0017ffa0 76eb19f1 badwindow!__tmainCRTStartup+0x150 [f:\sp\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 589]

0017ffac 7782d109 kernel32!BaseThreadInitThunk+0xe

0017ffec 00000000 ntdll!_RtlUserThreadStart+0x23

 

   1  Id: 3270.2cd0 Suspend: 0 Teb: 7efda000 Unfrozen

ChildEBP RetAddr 

026ffebc 777ecfad ntdll!ZwWaitForSingleObject+0x15

026fff20 777ecf78 ntdll!RtlpWaitOnCriticalSection+0x154

026fff48 0040153c ntdll!RtlEnterCriticalSection+0x152

026fff64 757c2848 badwindow!hangtype2threada+0x2c [c:\source\badwindow\badwindow\badwindow.cpp @ 358]

026fff9c 757c28c8 msvcr80!_endthread+0x4b

026fffa0 76eb19f1 msvcr80!_endthread+0xcb

026fffac 7782d109 kernel32!BaseThreadInitThunk+0xe

026fffec 00000000 ntdll!_RtlUserThreadStart+0x23

Note that the number before Id: is the debugger’s thread number. To the right of it are the process ID and thread ID in hex, written as PROCESS.THREAD (3270.2b10 here), followed by the thread state: Suspend: 0 Teb: 7efdd000 Unfrozen
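The PROCESS.THREAD pair is printed in hex, which is easy to forget when comparing it with decimal process IDs from other tools. Here is a minimal sketch of the conversion; DebuggerThreadId and parse_thread_token are hypothetical names made up for illustration and are not part of any debugger API.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper type for illustration only.
struct DebuggerThreadId {
    std::uint32_t process_id;
    std::uint32_t thread_id;
};

// Splits a "PROCESS.THREAD" token such as "3270.2b10" and converts
// both hex halves to integers.
DebuggerThreadId parse_thread_token(const std::string& token) {
    const std::size_t dot = token.find('.');
    if (dot == std::string::npos) {
        throw std::invalid_argument("expected PROCESS.THREAD");
    }
    DebuggerThreadId id{};
    id.process_id = static_cast<std::uint32_t>(
        std::stoul(token.substr(0, dot), nullptr, 16));
    id.thread_id = static_cast<std::uint32_t>(
        std::stoul(token.substr(dot + 1), nullptr, 16));
    return id;
}
```

Calling parse_thread_token("3270.2b10") yields process 12912, which is the same process ID that dumphungwindow reported when it dumped badwindow.exe.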

 

Each of these entries is the call stack of one thread. The most recent call is at the TOP of the stack, and the stack grows as each call is made. Looking at thread 0 you will see that our WndProc is blocked in a call to hangtype2, and hangtype2 is in turn blocked in a call to WaitForMultipleObjects. Let’s look more closely at WaitForMultipleObjects.

 

Docs for WaitForMultipleObjects

http://msdn2.microsoft.com/en-us/library/ms687025.aspx

 

DWORD WINAPI WaitForMultipleObjects( DWORD nCount, const HANDLE* lpHandles, BOOL bWaitAll, DWORD dwMilliseconds );

 

Let’s look at the parameters passed to WaitForMultipleObjects.

 

0:000> kv

ChildEBP RetAddr  Args to Child             

0017faf0 76e4edb5 00000002 0017fb40 00000000 ntdll!ZwWaitForMultipleObjects+0x15 (FPO: [5,0,0])

0017fb8c 76e430c3 0017fb40 0017fbc4 00000001 kernel32!WaitForMultipleObjectsEx+0x11d (FPO: [Non-Fpo])

0017fba8 00401502 00000002 0017fbc4 00000001 kernel32!WaitForMultipleObjects+0x18 (FPO: [Non-Fpo])

0017fbc8 0040139b 00401220 0017fbfc 00401220 badwindow!hangtype2+0x42 (FPO: [0,2,0]) (CONV: cdecl) [c:\source\badwindow\badwindow\badwindow.cpp @ 340]

0017fc24 772a87af 00063d36 00000111 00008004 badwindow!WndProc+0x17b (CONV: stdcall) [c:\source\badwindow\badwindow\badwindow.cpp @ 274]

0017fc50 772a8936 00401220 00063d36 00000111 user32!InternalCallWinProc+0x23

0017fcc8 772a8a7d 00000000 00401220 00063d36 user32!UserCallWinProcCheckWow+0x109 (FPO: [Non-Fpo])

0017fd2c 772a8ad0 00401220 00000000 0017ff0c user32!DispatchMessageWorker+0x380 (FPO: [Non-Fpo])

0017fd3c 004010fb 0017fd54 00403938 00000001 user32!DispatchMessageW+0xf (FPO: [Non-Fpo])

0017ff0c 00401817 00400000 00000000 00280f8c badwindow!wWinMain+0xfb (CONV: stdcall) [c:\source\badwindow\badwindow\badwindow.cpp @ 124]

0017ffa0 76eb19f1 7efde000 0017ffec 7782d109 badwindow!__tmainCRTStartup+0x150 (FPO: [Non-Fpo]) (CONV: cdecl) [f:\sp\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 589]

0017ffac 7782d109 7efde000 0017fb9e 00000000 kernel32!BaseThreadInitThunk+0xe (FPO: [Non-Fpo])

0017ffec 00000000 00401987 7efde000 00000000 ntdll!_RtlUserThreadStart+0x23 (FPO: [Non-Fpo])

 

The first parameter is 00000002. This is the number of objects we are waiting on.

The second parameter, 0017fbc4, is the address of the array of object handles. Let’s dump it out and take a look at the objects.

 

0:000> dd 0017fbc4

0017fbc4  000000c4 000000c8 0040139b 00401220

0017fbd4  0017fbfc 00401220 00063d36 0017fc48

0017fbe4  772a8989 772a894d 53ca28e7 00000000

0017fbf4  00063d36 00401220 00000000 00000000

0017fc04  00000000 0017fca0 00000001 00000000

0017fc14  ffffffff 772a88e5 53c4f4b4 75c12459

0017fc24  0017fc50 772a87af 00063d36 00000111

0017fc34  00008004 00000000 00401220 dcbaabcd

 

0:000> !handle 000000c4

Handle 000000c4

  Type            Thread

0:000> !handle 000000c8

Handle 000000c8

  Type            <Error retrieving type>

 

Looking at the second handle, it appears that the information needed to get the handle type is not in the dump. A handle is an index into the process handle table, which lives in the kernel. It’s possible that not all of the handle information was captured when the dump was created. However, that’s OK. We have a simple way to work around this and see what happened.

 

We can use the uf (unassemble function) command from part 1 of this blog on badwindow.exe. All we need to do is uf the return address of WaitForMultipleObjects. Let’s run through the assembly and see what we are waiting on.

 

0:000> uf 00401502

badwindow!hangtype2 [c:\source\badwindow\badwindow\badwindow.cpp @ 334]:

 

Reserving space on the stack by decrementing ESP (The stack pointer, remember the stack grows down in memory)

  334 004014c0 83ec08          sub     esp,8

 

Save the state of ESI so it can be restored later.

  334 004014c3 56              push    esi

 

Get the pointer to _beginthread from the import table and store it in ESI.
Docs on _beginthread: http://msdn2.microsoft.com/en-us/library/kdzttdcb(VS.80).aspx. The arguments are 1 (start address), 2 (stack size), 3 (arglist).

  337 004014c4 8b3580204000    mov     esi,dword ptr [badwindow!_imp___beginthread (00402080)]

 

 

Push the last arg to _beginthread on the stack. This is the arg list for the new thread; in this case it is 0 because we are passing no args.

  337 004014ca 6a00            push    0


This is our stack size. Note that in the debugger you can type ? 2EE0h and it will show the value in hex and decimal; this value is 12000 decimal.

  337 004014cc 68e02e0000      push    2EE0h

 

This is the start address for our thread function, in this case hangtype2threada

  337 004014d1 6810154000      push    offset badwindow!hangtype2threada (00401510)

 

Here we call _beginthread, which starts the thread. The return value is a thread handle.

  337 004014d6 ffd6            call    esi

 

Push the last arg to _beginthread on the stack. This is the arg list for the new thread; in this case it is 0 because we are passing no args.

  338 004014d8 6a00            push    0

 

This is our stack size arg for _beginthread.

  338 004014da 68e02e0000      push    2EE0h

 

This is the start address for our thread function, in this case hangtype2threadb

  338 004014df 6870154000      push    offset badwindow!hangtype2threadb (00401570)

 

We are now storing EAX (the return value from the first _beginthread call) at ESP+1Ch (on our stack).

  338 004014e4 8944241c        mov     dword ptr [esp+1Ch],eax

 

Here we call _beginthread, which starts the thread. The return value is a thread handle.

  338 004014e8 ffd6            call    esi

 

Any time we add to ESP we are shrinking or cleaning up the stack. Here 18h (24) bytes covers the six 4-byte arguments pushed for the two _beginthread calls.

  338 004014ea 83c418          add     esp,18h

 

We are pushing our wait time for WaitForMultipleObjects, in this case 0FFFFFFFFh (-1, INFINITE): wait forever.

  340 004014ed 6aff            push    0FFFFFFFFh

 

Storing EAX on the stack, this is the thread handle from our last _beginthread call.

  340 004014ef 8944240c        mov     dword ptr [esp+0Ch],eax

 

This is our wait logic. In this case bWaitAll is 1 (TRUE), so we will only unblock once all handles are signaled, which for thread handles means both threads have finished running.

  340 004014f3 6a01            push    1

 

Here we load the address of the stack location that holds the handles we will wait on into EAX.

  340 004014f5 8d44240c        lea     eax,[esp+0Ch]

 

Now we push the pointer to our handles / objects onto the stack.

  340 004014f9 50              push    eax

 

And this is the count of objects, 2 in this case, both of them threads.

  340 004014fa 6a02            push    2

 

Now we call WaitForMultipleObjects to wait for hangtype2threada and hangtype2threadb to finish executing.

  340 004014fc ff1510204000    call    dword ptr [badwindow!_imp__WaitForMultipleObjects (00402010)]

 

Restore our ESI register. This is the return address, so this instruction will run when WaitForMultipleObjects returns.

  340 00401502 5e              pop     esi 

 

Increment our stack pointer, releasing the 8 bytes reserved at the top of the function.

  342 00401503 83c408          add     esp,8

 

Return, we are done.

  342 00401506 c3              ret

 

Here is the source.

 

void hangtype2(void)

{

      HANDLE handles[2];

 

      handles[0] = (HANDLE)_beginthread(hangtype2threada, 12000, NULL);

      handles[1] = (HANDLE)_beginthread(hangtype2threadb, 12000, NULL);

     

      WaitForMultipleObjects(2,handles,1,INFINITE);

 

}

 

 

So what went wrong?  Let’s look at our threads again. 

 

We have our main message pump thread, thread 0, waiting on two threads. One, badwindow!hangtype2threada, is still running; the other, hangtype2threadb, is gone or has completed.

 

0  Id: 3270.2b10 Suspend: 0 Teb: 7efdd000 Unfrozen

ChildEBP RetAddr 

0017faf0 76e4edb5 ntdll!ZwWaitForMultipleObjects+0x15

0017fb8c 76e430c3 kernel32!WaitForMultipleObjectsEx+0x11d

0017fba8 00401502 kernel32!WaitForMultipleObjects+0x18

0017fbc8 0040139b badwindow!hangtype2+0x42 [c:\source\badwindow\badwindow\badwindow.cpp @ 340]

0017fc24 772a87af badwindow!WndProc+0x17b [c:\source\badwindow\badwindow\badwindow.cpp @ 274]

0017fc50 772a8936 user32!InternalCallWinProc+0x23

0017fcc8 772a8a7d user32!UserCallWinProcCheckWow+0x109

0017fd2c 772a8ad0 user32!DispatchMessageWorker+0x380

0017fd3c 004010fb user32!DispatchMessageW+0xf

0017ff0c 00401817 badwindow!wWinMain+0xfb [c:\source\badwindow\badwindow\badwindow.cpp @ 124]

0017ffa0 76eb19f1 badwindow!__tmainCRTStartup+0x150 [f:\sp\vctools\crt_bld\self_x86\crt\src\crtexe.c @ 589]

0017ffac 7782d109 kernel32!BaseThreadInitThunk+0xe

0017ffec 00000000 ntdll!_RtlUserThreadStart+0x23

 

Looking at hangtype2threada it would seem that it is blocked on RtlEnterCriticalSection.

 

   1  Id: 3270.2cd0 Suspend: 0 Teb: 7efda000 Unfrozen

ChildEBP RetAddr 

026ffebc 777ecfad ntdll!ZwWaitForSingleObject+0x15

026fff20 777ecf78 ntdll!RtlpWaitOnCriticalSection+0x154

026fff48 0040153c ntdll!RtlEnterCriticalSection+0x152

026fff64 757c2848 badwindow!hangtype2threada+0x2c [c:\source\badwindow\badwindow\badwindow.cpp @ 358]

026fff9c 757c28c8 msvcr80!_endthread+0x4b

026fffa0 76eb19f1 msvcr80!_endthread+0xcb

026fffac 7782d109 kernel32!BaseThreadInitThunk+0xe

026fffec 00000000 ntdll!_RtlUserThreadStart+0x23

 

Let’s look and see what is happening with this critical section call.

 

First let’s set our thread context to thread 1.

 

0:000> ~1s

eax=00000000 ebx=00000000 ecx=00000000 edx=00000000 esi=00403780 edi=00000000

eip=777fa69d esp=026ffec0 ebp=026fff20 iopl=0         nv up ei pl nz na po nc

cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000202

ntdll!ZwWaitForSingleObject+0x15:

777fa69d c20c00          ret     0Ch

 

Let’s get our call stack and find the first and only arg for EnterCriticalSection.

 

0:001> kv

ChildEBP RetAddr  Args to Child              

026ffebc 777ecfad 000000cc 00000000 00000000 ntdll!ZwWaitForSingleObject+0x15 (FPO: [3,0,0])

026fff20 777ecf78 00000000 00000000 76e61d5a ntdll!RtlpWaitOnCriticalSection+0x154 (FPO: [Non-Fpo])

026fff48 0040153c 00403780 00000000 00000000 ntdll!RtlEnterCriticalSection+0x152 (FPO: [Non-Fpo])

026fff64 757c2848 00000000 51b22bb2 00000000 badwindow!hangtype2threada+0x2c (FPO: [Uses EBP] [1,1,0]) (CONV: cdecl) [c:\source\badwindow\badwindow\badwindow.cpp @ 358]

026fff9c 757c28c8 76eb19f1 02274358 026fffec msvcr80!_endthread+0x4b (FPO: [Non-Fpo])

026fffa0 76eb19f1 02274358 026fffec 7782d109 msvcr80!_endthread+0xcb (FPO: [Non-Fpo])

026fffac 7782d109 02274358 026ffb9e 00000000 kernel32!BaseThreadInitThunk+0xe (FPO: [Non-Fpo])

026fffec 00000000 757c286e 02274358 00000000 ntdll!_RtlUserThreadStart+0x23 (FPO: [Non-Fpo])

 

We can do a couple of things at this point. First let’s look at the CS (critical section).

 

0:001> !cs 00403780

-----------------------------------------

Critical section   = 0x00403780 (badwindow!csCritSec1+0x0)

DebugInfo          = 0x0029bd40

LOCKED             < It’s LOCKED.

LockCount          = 0x1

WaiterWoken        = No

OwningThread       = 0x00002048  < This is the owning thread.

RecursionCount     = 0x14

LockSemaphore      = 0xCC

SpinCount          = 0x00000000

 

Are there any other locked critical sections? !locks will tell us, and no, this is the only one.

 

0:001> !locks

 

CritSec badwindow!csCritSec1+0 at 00403780

WaiterWoken        No

LockCount          1

RecursionCount     20

OwningThread       2048

EntryCount         0

ContentionCount    1

*** Locked

 

Scanned 156 critical sections

 

What threads are running in our process, and what is thread 2048 doing?

 

0:001> ~

#  0  Id: 3270.2b10 Suspend: 0 Teb: 7efdd000 Unfrozen

.  1  Id: 3270.2cd0 Suspend: 0 Teb: 7efda000 Unfrozen

 

OK, here is our problem. The owning thread, 2048, is not in the thread list, so it no longer exists. Apparently both threads, hangtype2threada and hangtype2threadb, were using this same critical section, but hangtype2threadb exited while still owning it. We need to figure out how that happened, so let’s go take a look at that function.

 

Looking back at where we unassembled badwindow!hangtype2, we got the address of hangtype2threadb (00401570). Let’s verify it with ln (list nearest symbols); we are lucky enough to have symbols in this case.

 

0:001> ln 00401570

c:\source\badwindow\badwindow\badwindow.cpp(370)

(00401570)   badwindow!hangtype2threadb   |  (004015e0)   badwindow!hangtype3thread

Exact matches:

    badwindow!hangtype2threadb (void *)

 

Looks like we have an exact match. Now let’s unassemble it and see what went wrong.

 

 

0:001> uf 00401570

badwindow!hangtype2threadb [c:\source\badwindow\badwindow\badwindow.cpp @ 370]:

 

Push ECX. This reserves the 4-byte stack slot used for the loop counter local at esp+10h below.

  370 00401570 51              push    ecx

 

Save EBX

  370 00401571 53              push    ebx

 

Move the pointer to sprintf into EBX

  371 00401572 8b1d7c204000    mov     ebx,dword ptr [badwindow!_imp__sprintf (0040207c)]

 

Save EBP

  371 00401578 55              push    ebp

 

Move the pointer to OutputDebugStringA into EBP

  371 00401579 8b2d14204000    mov     ebp,dword ptr [badwindow!_imp__OutputDebugStringA (00402014)]

 

Save ESI

  371 0040157f 56              push    esi

 

Move the pointer to EnterCriticalSection into ESI

  371 00401580 8b351c204000    mov     esi,dword ptr [badwindow!_imp__EnterCriticalSection (0040201c)]

 

Save EDI

  371 00401586 57              push    edi

 

Move the pointer for Sleep into EDI

  371 00401587 8b3d0c204000    mov     edi,dword ptr [badwindow!_imp__Sleep (0040200c)]

 

Store 14h (20 decimal) at ESP+10h (a local on the stack). Maybe this is a counter?

  371 0040158d c744241014000000 mov     dword ptr [esp+10h],14h

 

Push the address of the critical section csCritSec1 00403780 onto the stack.

  374 00401595 6880374000      push    offset badwindow!csCritSec1 (00403780)

 

Call EnterCriticalSection

  374 0040159a ffd6            call    esi

 

Push 0xFA (250 decimal) on the stack

  376 0040159c 68fa000000      push    0FAh

Call Sleep (Wait for 250ms)

  376 004015a1 ffd7            call    edi

Push pointer to value on the stack. When in doubt dump it out.

0:001> db 004021e4

004021e4  57 65 20 61 72 65 20 69-6e 20 68 61 6e 67 74 79  We are in hangty

004021f4  70 65 32 74 68 72 65 61-64 62 00 00 48 00 00 00  pe2threadb..H...

00402204  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................

00402214  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................

  377 004015a3 68e4214000      push    offset badwindow!`string' (004021e4)

 

 

Push pointer on the stack what is it?

0:001> db 00403380

00403380  57 65 20 61 72 65 20 69-6e 20 68 61 6e 67 74 79  We are in hangty

00403390  70 65 32 74 68 72 65 61-64 62 00 00 00 00 00 00  pe2threadb......

004033a0  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................

004033b0  00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00  ................

  377 004015a8 6880334000      push    offset badwindow!szTrace (00403380)

 

Call sprintf

  377 004015ad ffd3            call    ebx

 

Clean up the stack

  377 004015af 83c408          add     esp,8

 

This is our buffer that we just did the sprintf to

  378 004015b2 6880334000      push    offset badwindow!szTrace (00403380)


Call OutputDebugStringA

  378 004015b7 ffd5            call    ebp

 

Push csCritSec2’s address on the stack

  380 004015b9 6868374000      push    offset badwindow!csCritSec2 (00403768)

 

Call LeaveCriticalSection for csCritSec2

  380 004015be ff1524204000    call    dword ptr [badwindow!_imp__LeaveCriticalSection (00402024)]

 

Decrement a counter on the stack. This is a local counting down to zero.

  382 004015c4 836c241001      sub     dword ptr [esp+10h],1

 

Check the local counter that is counting down to zero. If we are not at ZERO yet, jump to the top of the loop.

  382 004015c9 75ca            jne     badwindow!hangtype2threadb+0x25 (00401595)

 

 

Restore all the registers and then return

  382 004015cb 5f              pop     edi

  382 004015cc 5e              pop     esi

  382 004015cd 5d              pop     ebp

  382 004015ce 5b              pop     ebx

  387 004015cf 59              pop     ecx

  387 004015d0 c3              ret

 

 

Did you see the BUG? Look closely. If you need it, here is the source.

void __cdecl hangtype2threadb(void *)

{

      int i=0;

      while(1)

      {

            EnterCriticalSection(&csCritSec1);

           

            Sleep(250);

            sprintf(szTrace, "We are in hangtype2threadb");

            OutputDebugStringA(szTrace);

           

            LeaveCriticalSection(&csCritSec2);

            i++;

            if(i==20)

            {

                  break;

            }

      }

}
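Note that the source counts i up from 0 and breaks at 20, while the compiled code stores 14h in a local and counts it down to zero. A minimal sketch of why the two shapes run the same number of iterations; both function names are hypothetical and the loop body is replaced by a simple iteration count.

```cpp
// Source shape: count up and break when i reaches 20.
int iterations_counting_up() {
    int iterations = 0;
    int i = 0;
    while (true) {
        ++iterations;          // stands in for the real loop body
        i++;
        if (i == 20) {
            break;
        }
    }
    return iterations;
}

// Compiled shape: start at 0x14, subtract 1, loop while the result is non-zero.
int iterations_counting_down() {
    int iterations = 0;
    int counter = 0x14;        // mov dword ptr [esp+10h],14h
    do {
        ++iterations;          // stands in for the real loop body
        counter -= 1;          // sub dword ptr [esp+10h],1
    } while (counter != 0);    // jne back to the top of the loop
    return iterations;
}
```

Both functions return 20, so the count-down in the disassembly is just the optimizer’s rewrite of the i==20 test in the source.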

 

We are entering one critical section and leaving another. We then drop out of the loop once our counter reaches its limit, and the thread terminates, leaving csCritSec1 entered but never left. The fix for this seems rather simple: we just need to leave csCritSec1 instead of csCritSec2. That should fix it. But if we don’t have the source, how can we verify that?
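To make the failure mode concrete, here is a toy model of the bookkeeping, not the real Win32 CRITICAL_SECTION. ToyCritSec, toy_enter, toy_leave and run_thread_b are all hypothetical names, and the model assumes that leaving a section you do not own is a harmless no-op, which is a simplification of what the real API does.

```cpp
#include <cstdint>

// Toy stand-in for a critical section: tracks only owner and recursion count.
struct ToyCritSec {
    std::uint32_t owning_thread = 0;  // 0 means unowned
    int recursion_count = 0;
};

// Returns true if the caller gets in, false where a real thread would block.
bool toy_enter(ToyCritSec& cs, std::uint32_t thread_id) {
    if (cs.owning_thread != 0 && cs.owning_thread != thread_id) {
        return false;
    }
    cs.owning_thread = thread_id;
    ++cs.recursion_count;
    return true;
}

// Simplification: leaving a section the caller does not own does nothing.
void toy_leave(ToyCritSec& cs, std::uint32_t thread_id) {
    if (cs.owning_thread != thread_id || cs.recursion_count == 0) {
        return;
    }
    if (--cs.recursion_count == 0) {
        cs.owning_thread = 0;
    }
}

// Runs the 20-iteration loop of thread b: enter one section, leave another.
void run_thread_b(ToyCritSec& entered, ToyCritSec& left, std::uint32_t thread_id) {
    for (int i = 0; i < 20; ++i) {
        toy_enter(entered, thread_id);
        toy_leave(left, thread_id);
    }
}
```

Passing two different sections to run_thread_b leaves the first one owned with a recursion count of 20, which lines up with the RecursionCount of 0x14 that !cs reported, and any other thread calling toy_enter on it is refused. Passing the same section for both arguments leaves it unowned.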

SIMPLE! We just modify the machine code in the debugger!   Often if we think we know how to fix something we will edit the code bytes to make the machine code do the right thing and let it run. 

 

To do this, from the command line in your debugger’s directory run windbg.exe C:\source\badwindow\release\badwindow.exe, assuming you have your badwindow sample in the same directory as I do. When the debugger fires up, make sure you have your symbol path set: .sympath SRV*c:\websymbols*http://msdl.microsoft.com/download/symbols

 

Our bad function call was

 

Push csCritSec2’s address on the stack  << WRONG CRITICALSECTION

  380 004015b9 6868374000      push    offset badwindow!csCritSec2 (00403768)

 

Call LeaveCriticalSection for csCritSec2

  380 004015be ff1524204000    call    dword ptr [badwindow!_imp__LeaveCriticalSection (00402024)]

 

Push the address of the critical section csCritSec1 00403780 onto the stack.   << CORRECT CRITICALSECTION

  374 00401595 6880374000      push    offset badwindow!csCritSec1 (00403780)

 

Call EnterCriticalSection

  374 0040159a ffd6            call    esi

 

Remember, all we need to do is change which critical section is pushed on the stack for the LeaveCriticalSection call.

 

004015b9 6868374000  (BAD)   
00401595 6880374000  (GOOD)

Now we just do an edit bytes (eb) on the bad instruction.

0:001> eb 004015b9

004015b9 68 68  << ENTER 68

68

004015ba 68 80  << We don’t want 68 enter 80

80

004015bb 37  << Now just hit enter to finish editing memory.
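Only one byte changes because the 68 opcode (push imm32) is followed by its 32-bit operand in little-endian order, so the low byte of the address comes first. A minimal sketch of that decoding on a plain byte array; PushImm32, push_operand and patch_low_byte are hypothetical names for illustration, and nothing here touches a live process.

```cpp
#include <array>
#include <cstdint>

// Five bytes: opcode 68 followed by a 4-byte little-endian immediate.
using PushImm32 = std::array<std::uint8_t, 5>;

// Decodes the 32-bit operand of a "push imm32" instruction.
std::uint32_t push_operand(const PushImm32& insn) {
    return static_cast<std::uint32_t>(insn[1])
         | static_cast<std::uint32_t>(insn[2]) << 8
         | static_cast<std::uint32_t>(insn[3]) << 16
         | static_cast<std::uint32_t>(insn[4]) << 24;
}

// Mirrors the eb session: the opcode byte stays 68, the next byte is replaced.
PushImm32 patch_low_byte(PushImm32 insn, std::uint8_t new_low_byte) {
    insn[1] = new_low_byte;
    return insn;
}
```

Starting from the bytes 68 68 37 40 00, push_operand gives 00403768 (csCritSec2); after patch_low_byte with 80 it gives 00403780 (csCritSec1).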

 

Here is the fixed code.

 

0:001> uf 00401570

badwindow!hangtype2threadb [c:\source\badwindow\badwindow\badwindow.cpp @ 370]:
  370 00401570 51              push    ecx
  370 00401571 53              push    ebx
  371 00401572 8b1d7c204000    mov     ebx,dword ptr [badwindow!_imp__sprintf (0040207c)]
  371 00401578 55              push    ebp
  371 00401579 8b2d14204000    mov     ebp,dword ptr [badwindow!_imp__OutputDebugStringA (00402014)]
  371 0040157f 56              push    esi
  371 00401580 8b351c204000    mov     esi,dword ptr [badwindow!_imp__EnterCriticalSection (0040201c)]
  371 00401586 57              push    edi
  371 00401587 8b3d0c204000    mov     edi,dword ptr [badwindow!_imp__Sleep (0040200c)]
  371 0040158d c744241014000000 mov     dword ptr [esp+10h],14h

ENTERING CORRECT CRITICAL SECTION csCritSec1

  374 00401595 6880374000      push    offset badwindow!csCritSec1 (00403780)
  374 0040159a ffd6            call    esi
  376 0040159c 68fa000000      push    0FAh
  376 004015a1 ffd7            call    edi
  377 004015a3 68e4214000      push    offset badwindow!`string' (004021e4)
  377 004015a8 6880334000      push    offset badwindow!szTrace (00403380)
  377 004015ad ffd3            call    ebx
  377 004015af 83c408          add     esp,8
  378 004015b2 6880334000      push    offset badwindow!szTrace (00403380)
  378 004015b7 ffd5            call    ebp

LEAVING CORRECT CRITICAL SECTION csCritSec1

  380 004015b9 6880374000      push    offset badwindow!csCritSec1 (00403780) 
  380 004015be ff1524204000    call    dword ptr [badwindow!_imp__LeaveCriticalSection (00402024)]

  382 004015c4 836c241001      sub     dword ptr [esp+10h],1
  382 004015c9 75ca            jne     badwindow!hangtype2threadb+0x25 (00401595)
  382 004015cb 5f              pop     edi
  382 004015cc 5e              pop     esi
  382 004015cd 5d              pop     ebp
  382 004015ce 5b              pop     ebx
  387 004015cf 59              pop     ecx
  387 004015d0 c3              ret

 

Then just run the code (press G, then Enter, in the debugger). That’s it, it will work!

Once you have proven this, you can go to the developer of the application and recommend that they change their code. Remember to provide your debug notes.

 

I hope you found this helpful and I welcome your feedback.

Thank you,  Jeff-

 

 


  • Isn't 'RecursionCount     20' a clue?

    [That just means the thread entered the critical section 20 times, it doesn't really indicate that the thread unlocked the wrong critsec.  There are legitimate reasons to enter the same critical section multiple times.]
