Rubato and Chord

Reiley's technical blog

    Batch File with Self-Awareness

    Batch files are a double-edged sword: the good side is that they run on almost all Microsoft platforms, while the evil side is that people rarely get them right.

    At the highest level, a batch file is interpreted by the command processor, which is cmd.exe or command.com. The interpreter is in charge of:

    • Processing escape sequences (e.g. caret).
    • Expanding environment variables.
    • Splitting the command string into parts.
    • Determining whether the command is an internal command (e.g. ECHO), an alias (specified using DOSKEY or AddConsoleAlias), or an external command (resolved through PATH and PATHEXT).
    • Maintaining the command process and its return value.
    • Taking care of the code page, pipes and I/O redirection.

    What the command processor does not care about is C/C++ style "argv" parsing and double quotes. It is the responsibility of each individual process to parse its own command line, although in the normal case this is handled by the CRT or shell32!CommandLineToArgvW.
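    To make the division of labor concrete, here is a minimal C++ sketch of the kind of argv splitting a process does for itself. It is a simplified illustration, not the CRT's actual code: split_command_line is a hypothetical name, and the sketch only covers whitespace, double quotes and the documented backslash-before-quote rule (2n backslashes plus a quote yield n backslashes and toggle quoting, 2n+1 backslashes plus a quote yield n backslashes and a literal quote).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified argv-style splitting (hypothetical helper, not the CRT algorithm).
std::vector<std::string> split_command_line(const std::string& cmd) {
    std::vector<std::string> args;
    std::string cur;
    bool in_quotes = false;
    bool have_arg = false;  // lets "" produce an empty argument
    std::size_t i = 0;
    while (i < cmd.size()) {
        const char c = cmd[i];
        if (c == '\\') {
            // Count the run of backslashes, then look at what follows it.
            std::size_t n = 0;
            while (i < cmd.size() && cmd[i] == '\\') { ++n; ++i; }
            if (i < cmd.size() && cmd[i] == '"') {
                cur.append(n / 2, '\\');
                if (n % 2 == 1) { cur.push_back('"'); ++i; }  // escaped quote
                // even run: the quote itself is handled on the next iteration
            } else {
                cur.append(n, '\\');  // backslashes are literal elsewhere
            }
            have_arg = true;
        } else if (c == '"') {
            in_quotes = !in_quotes;  // the quote is consumed, not copied
            have_arg = true;
            ++i;
        } else if ((c == ' ' || c == '\t') && !in_quotes) {
            if (have_arg) { args.push_back(cur); cur.clear(); have_arg = false; }
            ++i;
        } else {
            cur.push_back(c);
            have_arg = true;
            ++i;
        }
    }
    if (have_arg) args.push_back(cur);
    return args;
}
```

    The quotes disappear only at this stage. The command processor hands the string over with the quotes still in place, which is why a batch file invoked with double quotes sees them in %0.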

    Let's jump to today's question - how do you write a batch file that prints out its own full path? (hint: %~dpnx0 and %~f0 have a bug and won't work if the batch file is invoked with double quotes).

     

    I've put my answer as follows:

    @ECHO OFF
    
    SETLOCAL
    SETLOCAL ENABLEEXTENSIONS
    SETLOCAL ENABLEDELAYEDEXPANSION
    
    IF EXIST "%~f0" (
      IF NOT "%~x0" == "" (
        GOTO ENTRYPOINT
      )
    )
    
    SET PATH_CDR=%CD%;%PATH%
    :FINDNEXT
    FOR /F "delims=; tokens=1,*" %%P IN ("!PATH_CDR!") DO (
      SET PATH_CAR=%%P
      IF "!PATH_CAR:~-1!" == "\" (
        SET PATH_CAR=!PATH_CAR:~0,-1!
      )
      IF NOT "!PATH_CAR!" == "" (
        FOR %%X IN (%PATHEXT:;=;%) DO (
          IF EXIST "!PATH_CAR!\%~n0%%X" (
            SET PATH_CAR=!PATH_CAR!\%~n0%%X
            GOTO TRAMPOLINE
          )
        )
      )
      SET PATH_CDR=%%Q
      IF DEFINED PATH_CDR (
        GOTO FINDNEXT
      )
    )
    ECHO Error: failed to detect batch file path 1>&2
    EXIT /B -1
    
    :TRAMPOLINE
    CALL "!PATH_CAR!" %*
    EXIT /B !ERRORLEVEL!
    
    :ENTRYPOINT
    SETLOCAL DISABLEDELAYEDEXPANSION
    ECHO %~f0
    
    EXIT /B %ERRORLEVEL%
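    The search loop in the batch file (current directory first, then every PATH entry, each combined with every PATHEXT extension) can be sketched in portable C++ as follows. This is only an illustration of the lookup order; split_list and candidate_paths are hypothetical names, and the sketch lists the candidates instead of probing the file system.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits on ';' and drops empty items, like FOR /F "delims=;" does.
std::vector<std::string> split_list(const std::string& list) {
    std::vector<std::string> items;
    std::size_t start = 0;
    while (start <= list.size()) {
        std::size_t end = list.find(';', start);
        if (end == std::string::npos) end = list.size();
        if (end > start) items.push_back(list.substr(start, end - start));
        start = end + 1;
    }
    return items;
}

// Returns the candidate file names in the order the batch file tries them.
std::vector<std::string> candidate_paths(const std::string& base_name,
                                         const std::string& current_dir,
                                         const std::string& path,
                                         const std::string& pathext) {
    std::vector<std::string> result;
    for (std::string dir : split_list(current_dir + ";" + path)) {
        // Same normalization as the batch file: drop one trailing backslash.
        if (!dir.empty() && dir.back() == '\\') dir.pop_back();
        if (dir.empty()) continue;
        for (const std::string& ext : split_list(pathext))
            result.push_back(dir + "\\" + base_name + ext);
    }
    return result;
}
```

    The first candidate that exists on disk is the one the batch file re-invokes through the TRAMPOLINE label.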
    
    Pop Quiz - JavaScript for Fun

    Most people can write something in JavaScript, though few get it right.

    I started using JavaScript while I was in school, and the project I gave myself was to implement a Scheme interpreter that runs in web browsers. As a result, I realized how closely related JavaScript and Scheme are, became a fan of JavaScript, and have been using it a lot ever since.

    Before the quiz, let me tell a brief story of JavaScript.

    The JavaScript language was implemented by Brendan Eich of Netscape. Soon afterwards Microsoft developed a dialect called JScript and shipped it with Internet Explorer in 1996. Later in the same year standardization started, and the first edition was edited by Guy Steele, who also co-created Scheme and co-authored the Java language specification. The specification was named ECMA-262. Since it came after JavaScript and JScript, a new name, ECMAScript, was picked as a compromise among Netscape, Microsoft and Sun.

    Despite its Java-like syntax, JavaScript shares many concepts with functional programming languages. This mixture makes JavaScript easy to use (also easy to misuse), expressive, powerful and popular.

    Microsoft has a long history of providing JavaScript implementations, including JScript, JScript.NET and Chakra. While the old JScript engine is in sustaining engineering mode and JScript.NET seems to be deprecated, it's quite promising that Chakra has been powering both the latest Internet Explorer and Windows Runtime.

    Now let's put the question - what would you get from the following page?


    <html>
    <head>
      <title>JavaScript for Fun</title>
      <script type="text/javascript">
        var S = {};
        for (var N in (function () { return this; }).call(function (_) { return _; }()))
          S[N] = true;
        this.undefined = 1;
        undefined = 2;
        var undefined = 3;
        var A = this.undefined;
        this.B = undefined;
    </script>
    </head>
    <body>
      <script type="text/javascript">
        var C = 'global var C';
        this.D = 'global this.D';
        function F() { };
        this.G = function () {
          return (function () { return this; }).call(function (_) { return _; }());
        };
        this.H = function () { };
        delete F;
        delete G;
        eval('var Z = "global eval var"');
        (function (global) {
          var undefined = 'defined';
          E = undefined;
          var X = 'local var X';
          this.Y = 'local this.Y';
          delete H;
          eval('I = "it just works"');
          eval('var J = "local eval var"');
          alert((function (O) {
            var R = [];
            for (var M in O)
              if (!S.hasOwnProperty(M))
                R.push(M + ': ' + String(O[M]));
            return '[' + R.sort().join(', ') + ']';
          })(global));
        })(this);
        var K = 'global var K';
    </script>
    </body>
    </html>
    

     

    Vector Deleting Destructor

    Today a teammate asked a question regarding the behavior of the delete[] operator in C++ - how does the program know it needs to call CBar::~CBar instead of CFoo::~CFoo?

    Note that the vector deleting destructor is a feature of the Microsoft C++ compiler, not something required by the C++ standard.

    #define _CRTDBG_MAP_ALLOC
    #include <malloc.h>
    #include <crtdbg.h>
    
    class CFoo
    {
    public:
      virtual ~CFoo() = 0;
    };
    
    class CBar
    {
      virtual ~CBar()
      {
        _CrtDbgBreak();
      }
    };
    
    void __cdecl main()
    {
      CBar* p = new CBar[1];
      delete[] (CFoo*)p; // what does this mean?
      _CrtDumpMemoryLeaks();
    } 

    Well, the short answer is another question - how does a virtual destructor work, and how does the program know the actual number of elements in the array?

    If you've spent time understanding object layout in C++ and COM, you know that virtual functions are backed by a vtable, and it will not be too difficult to understand BSTR (a string whose length is stored immediately before the character data).

    So we are ready for the answer - delete[] is a combination of the BSTR approach and a virtual function.

    CBar::`vector deleting destructor':
    000007F7CD2E1140  mov         dword ptr [rsp+10h],edx  
    000007F7CD2E1144  mov         qword ptr [rsp+8],rcx  
    000007F7CD2E1149  sub         rsp,28h  
    000007F7CD2E114D  mov         eax,dword ptr [rsp+38h]  
    000007F7CD2E1151  and         eax,2
    000007F7CD2E1154  test        eax,eax
    000007F7CD2E1156  je          CBar::`vector deleting destructor'+5Eh (07F7CD2E119Eh)
    000007F7CD2E1158  lea         r9,[CBar::~CBar (07F7CD2E1120h)]  ; the address of CBar::~CBar
    000007F7CD2E115F  mov         rax,qword ptr [this]  ; this pointer
    000007F7CD2E1164  mov         r8d,dword ptr [rax-8]  ; array size *((size_t*)this - 1)
    000007F7CD2E1168  mov         edx,8  ; sizeof(CBar)
    000007F7CD2E116D  mov         rcx,qword ptr [this]  ; this pointer
    000007F7CD2E1172  call        `vector destructor iterator' (07F7CD2E1220h)

    As you can see here, the vector deleting destructor is emitted as a virtual function in the vtable, and it takes a flag parameter. If (flag & 0x2) is nonzero, the function that does the actual work, the vector destructor iterator, gets called.

    The size of the array is stored using the BSTR approach (immediately before the first element), and is passed in via the r8d register.
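    The length-prefix idea can be sketched in portable C++. This is an illustration of the technique only, not what the Microsoft compiler emits: new_counted, counted_size, delete_counted, kCookieSize and Probe are hypothetical names, and the sketch assumes the element type needs no more alignment than std::max_align_t.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Space reserved in front of the first element. It is large enough to hold
// the element count and keeps the elements suitably aligned.
constexpr std::size_t kCookieSize = alignof(std::max_align_t);
static_assert(kCookieSize >= sizeof(std::size_t), "cookie must hold a size_t");

// Allocates count elements and stores count just before the first one.
template <typename T>
T* new_counted(std::size_t count) {
    void* raw = std::malloc(kCookieSize + count * sizeof(T));
    if (raw == nullptr) throw std::bad_alloc();
    T* first = reinterpret_cast<T*>(static_cast<char*>(raw) + kCookieSize);
    reinterpret_cast<std::size_t*>(first)[-1] = count;  // *((size_t*)p - 1)
    for (std::size_t i = 0; i < count; ++i) new (first + i) T();
    return first;
}

// Reads the element count back from the slot in front of the array.
template <typename T>
std::size_t counted_size(const T* first) {
    return reinterpret_cast<const std::size_t*>(first)[-1];
}

// Plays the role of the destructor iterator: destroys every element in
// reverse order, then releases the block including the cookie.
template <typename T>
void delete_counted(T* first) {
    if (first == nullptr) return;
    for (std::size_t i = counted_size(first); i-- > 0;) first[i].~T();
    std::free(reinterpret_cast<char*>(first) - kCookieSize);
}

// Small probe type that counts how many instances are alive.
struct Probe {
    static int& live() { static int n = 0; return n; }
    Probe() { ++live(); }
    ~Probe() { --live(); }
};
```

    With Probe, new_counted<Probe>(3) leaves three live objects and delete_counted brings the count back to zero without the caller ever passing the length.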

    The callstack from Visual Studio Debugger also tells us the same thing:

    crtdbg.exe!`vector destructor iterator'(void * __t, unsigned __int64 __s, int __n, void (void *) * __f)	C++
    crtdbg.exe!CBar::`vector deleting destructor'(unsigned int)	C++
    crtdbg.exe!main() Line 22	C++
    crtdbg.exe!__tmainCRTStartup() Line 536	C
    crtdbg.exe!mainCRTStartup() Line 377	C
    kernel32.dll!BaseThreadInitThunk()	Unknown
    ntdll.dll!RtlUserThreadStart()	Unknown
    

    As for the actual meaning of the flag I mentioned, I'll leave it as homework for the readers (hint: you may try out the DGML tool).

     

    Undocumented Environment Variables

    Although we have fewer Easter eggs these days, there is still a huge number of undocumented behaviors.

    Recently I've been writing a CLR profiler using ICorProfilerCallback for fun. The CLR profiler is modeled as an in-proc COM server, and activation is done through environment variables:

    • SET COR_ENABLE_PROFILING=1
    • SET COR_PROFILER={XXXXXXXX-XXXX-XXXX-XXXX-XXXXXXXXXXXX}
    • SET COR_PROFILER_PATH="C:\FOO\BAR\MyProfiler.dll"

    Immediately I realized there must be a lot more environment variables, and it was a perfect time to use WinDBG: 

    cdb.exe -hd -g -G -xi ld -xe cpr -c "bu KERNELBASE!GetEnvironmentVariableW \"du @rcx; gc\"; g" "%WINDIR%\Microsoft.NET\Framework64\v4.0.30319\csc.exe"
    0:000> cdb: Reading initial command 'bu KERNELBASE!GetEnvironmentVariableW "du @rcx; gc"; g'
    000007fe`581406f0  "SHIM_DEBUG_LEVEL"
    000007fe`58140428  "SHIM_FILE_LOG"
    000007fe`581406f0  "SHIM_DEBUG_LEVEL"
    SHIMVIEW: ShimInfo(Complete)
    (ba0.db8): Break instruction exception - code 80000003 (first chance)
    ntdll!LdrpDoDebuggerBreak+0x30:
    000007fe`5c4eada0 cc              int     3
    0:000> g
    Microsoft (R) Visual C# Compiler version 4.0.30319.17929
    for Microsoft (R) .NET Framework 4.5
    Copyright (C) Microsoft Corporation. All rights reserved.
    
    warning CS2008: No source files specified
    000000a3`60e9c230  "COMPlus_Version"
    000000a3`60e9c230  "COMPlus_InstallRoot"
    000000a3`60e9b790  "COMPlus_InstallRoot"
    000000a3`60e9c140  "COMPlus_DefaultVersion"
    000000a3`60e9bad0  "COMPlus_InstallRoot"
    000000a3`60e9b880  "COMPlus_InstallRoot"
    000000a3`60e9b8a0  "COMPlus_3gbEatMem"
    000000a3`60e9b400  "COMPLUS_CLRLoadLogDir"
    000000a3`60e9b1d0  "COMPlus_InstallRoot"
    000000a3`60e9b180  "COMPlus_NicPath"
    000000a3`60e9b180  "COMPlus_RegistryRoot"
    000000a3`60e9b180  "COMPlus_AssemblyPath"
    000000a3`60e9b180  "COMPlus_AssemblyPath2"
    000000a3`60e9ab80  "COMPLUS_InstallRoot"
    000000a3`60e9ab40  "COMPLUS_DefaultVersion"
    000000a3`60e9ab40  "COMPLUS_Version"
    000000a3`60e9af60  "COMPLUS_ApplicationMigrationRunt"
    000000a3`60e9afa0  "imeActivationConfigPath"
    000000a3`60e9afb0  "COMPLUS_OnlyUseLatestCLR"
    000007fe`51a06580  "APPX_PROCESS"
    000000a3`60e9af60  "COMPLUS_BuildFlavor"
    000007f6`2e5c16a8  "LIB"
    error CS1562: Outputs without source must have the /out option specified
    

    As we can see, the SHIM_* names are the environment variables used by the Application Compatibility team for debugging a shim.

    The COMPlus_* (and COMPLUS_*) names are the environment variables used by COMPLUS, which was the original name for .NET.

    Note that my WinDBG approach is not guaranteed to be complete, since GetEnvironmentVariableW is not the only way to retrieve environment variables.
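    The raw log contains many repeats, and du splits a long name across two lines (see COMPLUS_ApplicationMigrationRunt / imeActivationConfigPath above). A small post-processing helper can tidy that up. The following C++ sketch is my own illustration, not part of any debugger: parse_du_line and unique_variable_names are hypothetical names, and joining two lines when the second address directly follows the first string is a heuristic.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct DuLine { std::uint64_t address; std::string text; };

// Parses one line of du output such as: 000007fe`581406f0  "SHIM_DEBUG_LEVEL"
// Anything other than hex digits, backticks and blanks before the first
// quote means the line is not du output.
bool parse_du_line(const std::string& line, DuLine& out) {
    const std::size_t open = line.find('"');
    const std::size_t close = line.rfind('"');
    if (open == std::string::npos || close <= open) return false;
    std::string hex;
    for (std::size_t i = 0; i < open; ++i) {
        const unsigned char c = static_cast<unsigned char>(line[i]);
        if (std::isxdigit(c)) hex.push_back(static_cast<char>(c));
        else if (c != '`' && !std::isspace(c)) return false;
    }
    if (hex.empty() || hex.size() > 16) return false;
    out.address = std::stoull(hex, nullptr, 16);
    out.text = line.substr(open + 1, close - open - 1);
    return true;
}

std::string to_upper(std::string s) {
    for (char& c : s) c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return s;
}

// Returns each variable name once (names are compared case-insensitively),
// in order of first appearance, with wrapped names joined back together.
std::vector<std::string> unique_variable_names(const std::vector<std::string>& log) {
    std::vector<std::string> names, seen;
    DuLine pending{0, ""};
    bool has_pending = false;
    auto flush = [&]() {
        if (!has_pending) return;
        has_pending = false;
        const std::string key = to_upper(pending.text);
        for (const std::string& s : seen) if (s == key) return;
        seen.push_back(key);
        names.push_back(pending.text);
    };
    for (const std::string& line : log) {
        DuLine cur;
        if (!parse_du_line(line, cur)) { flush(); continue; }
        // A UTF-16 string of N characters occupies 2 * N bytes.
        if (has_pending && cur.address == pending.address + 2 * pending.text.size()) {
            pending.text += cur.text;  // continuation of a wrapped name
            continue;
        }
        flush();
        pending = cur;
        has_pending = true;
    }
    flush();
    return names;
}
```

    Fed with the log above, this reports COMPlus_InstallRoot once and reconstructs COMPLUS_ApplicationMigrationRuntimeActivationConfigPath from its two halves.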

    I'll leave some homework for the readers - find the environment variables consumed by the Visual Studio IDE (hint: you need to modify the command line to make it work under 32bit).

    The Pit of Success

    The Pit of Success: in stark contrast to a summit, a peak, or a journey across a desert to find victory through many trials and surprises, we want our customers to simply fall into winning practices by using our platform and frameworks. To the extent that we make it easy to get into trouble we fail.

    - by Rico Mariani

    In my self-introduction, I mentioned my past experience developing ATL, CRT, MFC and STL. I've learned a lot from that experience, and as I work with people who develop services on the .NET platform, there are many interesting findings. Each time I step back and look at what has been developed, I love the "Pit of Success" concept brought up by Rico Mariani even more. So here I'm trying to share some of my personal understanding.

    ATL (not including ATL Server) is my personal favorite library for the following reasons:

    • ATL is a library instead of framework
      • in order to adopt ATL you don't have to change any existing architecture
    • ATL is lightweight
      • in most cases you can just use ATL header files, without dealing with *.lib files and DLLs
      • ATL won't bloat the binary size too much
      • ATL won't harm performance for sure
    • ATL is predictable
      • debugging through ATL source is straightforward
      • the main purpose of ATL is to make COM programming easier in C++, not to hide the complexity of the operating system concepts
      • the COM wrapper has a very clear definition on the thread safety

    CRT is the library from which I learned a lot:

    • CRT acts as a bridge between the operating system and most applications
      • dealing with application startup and cleanup
      • providing runtime services (e.g. backing C++ exception scope with SEH and exception filters)
      • balancing between the underlying system implementation and the C/C++ standard specification
    • CRT has evil contracts with C++ compiler toolchain
    • CRT is not only consumed by normal applications, but also by operating system and compiler toolchain (which sounds like an infinite loop)
    • CRT always sits in the frontline of supporting new hardware architecture
    • CRT needs to be the most stable and compatible library

    MFC is a framework consisting of class libraries:

    • it's easy to get started with MFC
    • MFC takes care of window management, messaging, COM and many great things
    • you really need to understand MFC in order to use it well, especially when you run into bug hell

    STL is a C++ template library, which I used extensively (in combination with Boost) when I was in school, when I worked on STL itself, and whenever I have had a good reason to use it:

    • if I have a free choice, normally I would tend not to use STL
    • debugging template code is not easy, especially the code that makes extensive use of template meta-programming and macro meta-programming
    • running code coverage analysis against template code is tricky (source line coverage is almost meaningless, you have to use basic block coverage)
    • it's a bad idea to pass STL objects (e.g. as parameters or exceptions) across module boundaries, since the underlying contract (e.g. object layout) might be different
    • binary size might bloat and turn into a serious problem

     

    Error and Exception Revisited

    Unless suffering is the direct and immediate object of life, our existence must entirely fail of its aim. It is absurd to look upon the enormous amount of pain that abounds everywhere in the world, and originates in needs and necessities inseparable from life itself, as serving no purpose at all and the result of mere chance. Each separate misfortune, as it comes, seems, no doubt, to be something exceptional; but misfortune in general is the rule.

    - by Arthur Schopenhauer

    In the world of programming, errors and exceptions seem to be unavoidable. This is especially true when it comes to writing production quality code.

    Windows programming can be challenging when both errors and exceptions are used.

    Exception

    Exceptions always exist. Even if you don't care about them, they will disrupt the normal flow of execution. In the Windows operating system, exceptions are implemented as SEH/VEH, with support from the CPU.

    The key characteristics of exceptions in Windows are:

    1. They disrupt the normal flow of execution, which translates to pipeline invalidation and slowness.
    2. They can be used consistently in user mode and kernel mode.
    3. They have a lot of great features, such as continuing execution and first chance versus second chance handling.
    4. An unhandled exception goes to the operating system, which kills the application (e.g. Dr. Watson) or the system (e.g. BSOD).
    5. If you swallow exceptions used by the operating system or runtime (which you shouldn't catch), the application might not function correctly (e.g. you might get an access violation while accessing the PAGE_GUARD page of the stack).

    The third most famous exception is Out of Memory - for device drivers and server applications, you always want to handle it; for client applications, it is probably not as critical (e.g. if the Visual Studio IDE is running out of memory, we just let it crash).

    The second most famous exception is Stack Overflow - for hosting environments and fundamental libraries like the CRT, you need to take it into consideration; in other cases it means you have a design issue, and normally you don't want to handle it.

    Let's take a look at the following pseudo code:

    try
    {
      // do something
    }
    catch(StackOverflowException ex)
    {
      log("oops, stack overflow {0}", ex.stack);
      throw ex;
    }
    finally
    {
      // close file handle, etc.
    }
    

    There are several things I can tell:

    1. Having "throw ex" ruins the exception information (the stack trace is reset); it is better to use "throw" instead.
    2. The "log" function doesn't have much stack space, so it could trigger another stack overflow.
    3. By the time the exception is re-thrown, we are already miles away from the original place of the problem - we are not keeping the scene intact, and nobody would want to debug a dump file for this case.
    4. The process is dying, so closing file handles will not make it any better; the operating system will do that for you. More importantly, it is very likely that you are making things even worse.
    5. If you are using the latest version of the .NET Framework, normally you are NOT allowed to catch StackOverflowException:
      In prior versions of the .NET Framework, your application could catch a StackOverflowException object (for example, to recover from unbounded recursion). However, that practice is currently discouraged because significant additional code is required to reliably catch a stack overflow exception and continue program execution.

      Starting with the .NET Framework version 2.0, a StackOverflowException object cannot be caught by a try-catch block and the corresponding process is terminated by default. Consequently, users are advised to write their code to detect and prevent a stack overflow. For example, if your application depends on recursion, use a counter or a state condition to terminate the recursive loop. Note that an application that hosts the common language runtime (CLR) can specify that the CLR unload the application domain where the stack overflow exception occurs and let the corresponding process continue. For more information, see ICLRPolicyManager Interface and Hosting Overview.


      Windows 95, Windows 98, Windows 98 Second Edition, Windows Millennium Edition Platform Note: A thrown StackOverflowException cannot be caught by a try-catch block. Consequently, the exception causes the process to terminate immediately.

    The most famous exception is Access Violation. In normal cases this is a killer bug that stops the entire development team.

    Error

    Errors are more of a convention. Their main characteristics are:

    1. Lightweight - they don't require infrastructure support from the operating system or the CPU.
    2. Less picky compared to exceptions - an error won't complain if you haven't paid enough attention to it, however it might eventually kill you if not handled properly.

    Windows Error Code

    Most Win32 APIs return a BOOL value. If FALSE is returned, the error code is stored in the TEB as a DWORD, which can be retrieved by calling GetLastError.

    You can use WinDBG's !ext.gle or Visual Studio's $ERR trick to retrieve the last error code from the TEB, which also works for dump files.
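    The convention itself is easy to emulate in portable C++. The sketch below is an emulation for illustration only: the names in namespace sketch are hypothetical stand-ins rather than the real Win32 functions, and a thread_local variable plays the role of the per-thread slot in the TEB. The two numeric values are the real ERROR_FILE_NOT_FOUND and ERROR_INVALID_PARAMETER codes.

```cpp
#include <cstdint>
#include <string>

namespace sketch {

// Per-thread slot; on Windows the real value lives in the TEB.
thread_local std::uint32_t last_error = 0;

constexpr std::uint32_t kErrorFileNotFound = 2;       // ERROR_FILE_NOT_FOUND
constexpr std::uint32_t kErrorInvalidParameter = 87;  // ERROR_INVALID_PARAMETER

void set_last_error(std::uint32_t code) { last_error = code; }
std::uint32_t get_last_error() { return last_error; }

// BOOL-style API: returns false and records the reason on failure.
// On success the last error is left untouched, so it may be stale.
bool delete_item(const std::string& name) {
    if (name.empty()) {
        set_last_error(kErrorInvalidParameter);
        return false;
    }
    if (name != "exists.txt") {  // pretend only this one item exists
        set_last_error(kErrorFileNotFound);
        return false;
    }
    return true;
}

}  // namespace sketch
```

    The point of the sketch is the calling discipline: check the return value first, and read the last error immediately after a failure, because a later successful call does not necessarily reset it.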

    The good side of storing the last error in a central place (e.g. the TEB) is the ability to set a data breakpoint to see where the error is coming from. Later versions of Windows also have a feature that does similar things:

    The registry API in advapi32.dll is special: the Windows error code is returned directly as a LONG (signed long).

    WinSock and NetAPI have similar concept but different mapping.

    NTSTATUS

    The lower layers of the operating system make use of NTSTATUS, which has a structure similar to a Win32 error. An incomplete mapping table can be found in Mapping NT Status Error Codes to Win32 Error Codes.

    For LSA (Local Security Authority) specifically, the NTSTATUS return value can be converted to a Win32 error using LsaNtStatusToWinError.
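    For reference, the NTSTATUS bit layout (severity in bits 31-30, facility in bits 27-16, code in bits 15-0) can be decoded with a few lines of portable C++. This is a sketch with hypothetical helper names; nt_success mirrors the NT_SUCCESS rule that success and informational values both count as success.

```cpp
#include <cstdint>

enum class Severity { Success = 0, Informational = 1, Warning = 2, Error = 3 };

// The top two bits carry the severity.
constexpr Severity nt_severity(std::uint32_t status) {
    return static_cast<Severity>(status >> 30);
}

// Same rule as NT_SUCCESS: the sign bit must be clear.
constexpr bool nt_success(std::uint32_t status) {
    return (status >> 31) == 0;
}

// Bits 27-16 carry the facility, bits 15-0 the code.
constexpr std::uint32_t nt_facility(std::uint32_t status) {
    return (status >> 16) & 0x0FFFu;
}

constexpr std::uint32_t nt_code(std::uint32_t status) {
    return status & 0xFFFFu;
}
```

    For example, 0xC0000005 (STATUS_ACCESS_VIOLATION) decodes as an error with code 5, while 0x80000003 (STATUS_BREAKPOINT, the "Break instruction exception" seen in debugger logs) decodes as a warning.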

    HRESULT

    COM makes use of HRESULT, which was designed to be a super container for all kinds of error codes. There is a macro HRESULT_FROM_WIN32 (p.s. in later versions of Windows this has been changed to an inline function) which converts a Windows error code to an HRESULT.

    One caveat about HRESULT is that you always have to keep S_OK, S_FALSE, SUCCEEDED and FAILED in mind, and understand the differences.
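    Both points can be shown with a portable C++ re-implementation. The sketch uses its own lower-case names (hresult_t, succeeded, failed, hresult_from_win32) so it does not depend on the Windows headers; the logic follows the HRESULT_FROM_WIN32 definition, with FACILITY_WIN32 equal to 7.

```cpp
#include <cstdint>

using hresult_t = std::int32_t;

constexpr hresult_t kSOk = 0;     // S_OK
constexpr hresult_t kSFalse = 1;  // S_FALSE: a success code that is not S_OK
constexpr std::uint32_t kFacilityWin32 = 7;  // FACILITY_WIN32

// SUCCEEDED / FAILED only look at the sign bit.
inline bool succeeded(hresult_t hr) { return hr >= 0; }
inline bool failed(hresult_t hr) { return hr < 0; }

// Zero and values that already look like a failure HRESULT pass through;
// everything else gets the failure bit and the Win32 facility.
inline hresult_t hresult_from_win32(std::uint32_t error) {
    const hresult_t as_hr = static_cast<hresult_t>(error);
    if (as_hr <= 0) return as_hr;
    return static_cast<hresult_t>((error & 0x0000FFFFu) |
                                  (kFacilityWin32 << 16) | 0x80000000u);
}
```

    ERROR_ACCESS_DENIED (5) becomes 0x80070005, the familiar E_ACCESSDENIED. Note that succeeded(kSFalse) is true while kSFalse == kSOk is false, which is exactly the trap when code compares against S_OK instead of using SUCCEEDED.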

     

    Since there are so many different kinds of error codes, even people working at Microsoft may get confused. That's why people tend to create tools to make the situation better:

    1. The WinDBG extension command !ext.error, which supports both Windows Error Code and NTSTATUS.
    2. The Visual Studio debugger.
    3. The Error Code Look-up tool, implemented by the Exchange team.

    A Debugging Approach to Windows RT

    Recently I got a Surface with Windows RT. Needless to say, it's wonderful!

    I've figured out some quick facts about Windows RT by looking at the C:\Windows\system32\ntdll.dll from Windows RT:

    • A complete NT (instead of WINCE) kernel and almost a full stack of Windows operating system.
    • Almost the same PE/COFF structure as x86.
    • Using ARM's "non classic RISC style" Thumb-2 instruction set (pImageNtHeaders->FileHeader.Machine == IMAGE_FILE_MACHINE_ARMNT), which has great code density, and in turn gives smaller binary and less memory pressure.
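    The Machine check mentioned in the last bullet does not need the Windows headers. Here is a minimal portable C++ sketch that reads the field from a raw image buffer; pe_machine and the k-prefixed constants are names of my own, while the offsets (e_lfanew at 0x3C, the "PE\0\0" signature, Machine right after it) and the machine values come from the PE/COFF format.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::uint16_t kMachineUnknown = 0x0000;  // IMAGE_FILE_MACHINE_UNKNOWN
constexpr std::uint16_t kMachineI386 = 0x014C;     // IMAGE_FILE_MACHINE_I386
constexpr std::uint16_t kMachineArmNt = 0x01C4;    // IMAGE_FILE_MACHINE_ARMNT
constexpr std::uint16_t kMachineAmd64 = 0x8664;    // IMAGE_FILE_MACHINE_AMD64

// Reads FileHeader.Machine from a PE image held in memory.
// Returns kMachineUnknown if the buffer does not look like a PE file.
std::uint16_t pe_machine(const std::vector<std::uint8_t>& image) {
    if (image.size() < 0x40 || image[0] != 'M' || image[1] != 'Z')
        return kMachineUnknown;
    // e_lfanew: little-endian offset of the "PE\0\0" signature.
    const std::uint32_t pe = static_cast<std::uint32_t>(image[0x3C]) |
                             (static_cast<std::uint32_t>(image[0x3D]) << 8) |
                             (static_cast<std::uint32_t>(image[0x3E]) << 16) |
                             (static_cast<std::uint32_t>(image[0x3F]) << 24);
    if (pe > image.size() - 6) return kMachineUnknown;  // signature + Machine
    if (image[pe] != 'P' || image[pe + 1] != 'E' ||
        image[pe + 2] != 0 || image[pe + 3] != 0)
        return kMachineUnknown;
    return static_cast<std::uint16_t>(image[pe + 4] | (image[pe + 5] << 8));
}
```

    Running this over a binary copied from a Windows RT device should report kMachineArmNt, whereas a desktop binary reports kMachineI386 or kMachineAmd64.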

    I've never had a chance to debug Thumb-2 code before, so I've listed the things I need to grasp:

    • Fundamental ARM architecture and Thumb-2 instructions.
    • ABI (Application Binary Interface), calling convention and exception handling mechanism.
    • Programming and debugging.

    Programming

    Although Visual Studio 2012 doesn't have an ARM version, it does include the x86 cross toolchain for targeting the ARM architecture, which can be found in %ProgramFiles(x86)%\Microsoft Visual Studio 11.0\VC\bin\x86_arm\. By setting the correct environment variables (INCLUDE, LIB, LIBPATH, PATH) we can generate ARM modules smoothly.

    int main()
    {
      return 0;
    }
    
    C:\Program Files (x86)\Microsoft Visual Studio 11.0\VC\bin\x86_arm>cl.exe /link /NODEFAULTLIB /ENTRY:main test.cpp
    Microsoft (R) C/C++ Optimizing Compiler Version 17.00.51106.1 for ARM
    Copyright (C) Microsoft Corporation.  All rights reserved.
    
    test.cpp
    Microsoft (R) Incremental Linker Version 11.00.51106.1
    Copyright (C) Microsoft Corporation.  All rights reserved.
    
    /out:test.exe
    /machine:arm
    /NODEFAULTLIB
    /ENTRY:main
    test.obj
    

    By default Visual Studio doesn't allow generating native ARM binaries. The restriction comes from an MSBuild property named WindowsSDKDesktopARMSupport; by setting this property to true I could target native ARM without an issue.

    Another problem is that the Windows SDK doesn't have the ARM version of the import libraries, which means we don't have files like gdi32.lib and shell32.lib. The solution is to create them, either by writing a DEF file or by creating a stub module. Since we can get the ARM version of the DLLs from Windows RT, it is easy to dump the export directory and create a DEF file automatically, as long as the DLLs we use do not export mangled names (otherwise I would prefer the stub module approach).
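    Generating the DEF text from a list of export names is mechanical. A minimal C++ sketch follows, assuming the export names have already been dumped by some other tool; make_def and has_mangled_exports are hypothetical helper names, and the mangled-name check relies on the fact that MSVC-decorated C++ names start with '?'.

```cpp
#include <string>
#include <vector>

// True if any export looks like an MSVC-decorated C++ name.
bool has_mangled_exports(const std::vector<std::string>& exports) {
    for (const std::string& name : exports)
        if (!name.empty() && name[0] == '?') return true;
    return false;
}

// Builds the text of a module-definition file for the given library.
std::string make_def(const std::string& library,
                     const std::vector<std::string>& exports) {
    std::string def = "LIBRARY " + library + "\nEXPORTS\n";
    for (const std::string& name : exports)
        def += "    " + name + "\n";
    return def;
}
```

    The resulting file can then be handed to the librarian, for example with something like lib /def:gdi32.def /machine:arm /out:gdi32.lib, to produce the missing import library.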

    This is what I got while running my very first hello.exe on Windows RT :)

    For pure managed code programming, the .NET runtime in Surface RT comes with the standard C# compiler:

    class Hello
    {
      static void Main()
      {
      }
    }
    
    C:\Users\Reiley\Desktop>"%WINDIR%\Microsoft.NET\Framework\v4.0.30319\csc.exe" /noconfig /debug+ /platform:anycpu hello.cs
    Microsoft (R) Visual C# Compiler version 4.0.30319.17929 
    for Microsoft (R) .NET Framework 4.5
    Copyright (C) Microsoft Corporation. All rights reserved.
    

    And this time I got something worse:

    After changing the compiler flag to target ARM instead of AnyCPU, it became better and I was again greeted with the "Windows cannot verify the digital signature ..." dialog.

    C:\Users\Reiley\Desktop>"%WINDIR%\Microsoft.NET\Framework\v4.0.30319\csc.exe" /noconfig /debug+ /platform:arm hello.cs
    

    Debugging

    The x86 and amd64 versions of WinDBG both support various architectures including ARM Thumb-2, which means you can open dump files from a PC. This is a good place to get started; in fact I copied notepad.exe from the Surface to my PC and used cdb.exe -z notepad.exe to familiarize myself with the ARM PE structure and disassembly.

    There is no ARM version of WinDBG available for public download. Also, ntsd.exe is no longer shipped as part of Windows since Vista.

    Visual Studio 2012 comes with a great debugger, together with a fantastic remote debugging agent. Jason Zander already explained how to set up remote debugging in his great article "What you need to know about developing for Windows on ARM". I'll just put a conclusion here:

    1. Visual Studio 2012 doesn't have an ARM version; in fact only the x86 version is available.
    2. Visual Studio 2012 supports remote kernel debugging, however there is no direct way to enable kernel debugging on a Windows RT device.
    3. Visual Studio 2012 comes with an ARM version of the remote debugging agent, which makes it possible to do user mode debugging on nearly all processes. To unleash the power, run the remote debugging agent (msvsmon) as a service under an administrator account.
    4. User mode debugging on Windows RT is powerful enough that you can do whatever hack you like.

     

    (to be continued...)

    Postmortem Debugging - Better Late Than Never

    If there is a consistent repro, I would definitely prefer Early Debugging. However, in real life postmortem debugging seems to be unavoidable.

    There are three concepts I wish to clarify before digging into the details:

    1. AeDebug is a set of registry keys which specify the behavior when an unhandled exception happens in a user mode application.

      • \\HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\AeDebug
      • \\HKEY_LOCAL_MACHINE\Software\Wow6432Node\Microsoft\Windows NT\CurrentVersion\AeDebug

      By default AeDebug is configured to use drwtsn32.exe, which captures a dump and terminates the faulting application.

    2. Just-In-Time Debugging (a.k.a. JIT Debugging) is a feature provided by most debuggers (e.g. CDB, NTSD, WinDBG and Visual Studio Debugger), which allows the debugger to be launched and attached to the faulting application.

      The JIT debugger shipped with Visual Studio is called vsjitdebugger.exe, which pops up a window and lets you decide the next step. Visual Studio goes a step further by allowing JIT debugging for scripts.

      Needless to say, JIT Debugging is normally built on top of AeDebug.

    3. Postmortem Debugging is an overloaded term which could mean debugging a dump, or JIT debugging.

      Since I will cover JIT debugging in another article, I prefer to refer to dump file debugging as Postmortem Debugging.

    Okay, now let's go back to the topic: what would you do after receiving a dump file?

    1. Understand the source of the dump file, i.e. under which conditions the dump file was generated. Once you've confirmed the dump is coming from a trusted source, try to find out when and where the dump file was taken.

      0:001> .time
      Debug session time: Mon Dec 3 17:36:58.997 2012 (UTC - 8:00)
      System Uptime: 2 days 23:31:41.638
      Process Uptime: 0 days 0:00:14.156
      Kernel time: 0 days 0:00:00.015
      User time: 0 days 0:00:00.000

      0:001> vertarget
      Windows 7 Version 7601 (Service Pack 1) MP (8 procs) Free x64 Product: LanManNt, suite: Enterprise TerminalServer SingleUserTS
      kernel32.dll version: 6.1.7601.17514 (win7sp1_rtm.101119-1850)
      Machine Name:
      Debug session time: Mon Dec 3 18:37:21.103 2012 (UTC - 8:00)
      System Uptime: 3 days 0:32:03.743
      Process Uptime: 0 days 1:00:36.261
      Kernel time: 0 days 0:00:00.015
      User time: 0 days 0:00:00.000

      0:000> .lastevent
      Last event: 14d0.1874: Break instruction exception - code 80000003 (first chance)


    2. Check the dump file type - mini dump or full dump, kernel dump or user mode dump, and whether the dump contains an exception record. Normally WinDBG displays the dump type when you open a dump file; here we'll use the command learned in Undocumented WinDBG.
        
      0:001> .dumpdebug
      ----- User Mini Dump Analysis
      MINIDUMP_HEADER:
      Version         A793 (6804)
      NumberOfStreams 14
      Flags           9164
                      0004 MiniDumpWithHandleData
                      0020 MiniDumpWithUnloadedModules
                      0040 MiniDumpWithIndirectlyReferencedMemory
                      0100 MiniDumpWithProcessThreadData
                      1000 MiniDumpWithThreadInfo
                      8000 MiniDumpWithFullAuxiliaryState
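    The Flags value in the header is simply a bitwise OR of MINIDUMP_TYPE values, so the breakdown printed by .dumpdebug can be reproduced by hand. The C++ sketch below uses a hypothetical helper name, decode_minidump_type; the bit values are the ones defined for MINIDUMP_TYPE up to MiniDumpWithFullAuxiliaryState.

```cpp
#include <cstdint>
#include <string>
#include <vector>

struct FlagName { std::uint32_t bit; const char* name; };

// MINIDUMP_TYPE bits, lowest first.
const FlagName kMiniDumpFlags[] = {
    {0x0001, "MiniDumpWithDataSegs"},
    {0x0002, "MiniDumpWithFullMemory"},
    {0x0004, "MiniDumpWithHandleData"},
    {0x0008, "MiniDumpFilterMemory"},
    {0x0010, "MiniDumpScanMemory"},
    {0x0020, "MiniDumpWithUnloadedModules"},
    {0x0040, "MiniDumpWithIndirectlyReferencedMemory"},
    {0x0080, "MiniDumpFilterModulePaths"},
    {0x0100, "MiniDumpWithProcessThreadData"},
    {0x0200, "MiniDumpWithPrivateReadWriteMemory"},
    {0x0400, "MiniDumpWithoutOptionalData"},
    {0x0800, "MiniDumpWithFullMemoryInfo"},
    {0x1000, "MiniDumpWithThreadInfo"},
    {0x2000, "MiniDumpWithCodeSegs"},
    {0x4000, "MiniDumpWithoutAuxiliaryState"},
    {0x8000, "MiniDumpWithFullAuxiliaryState"},
};

// Returns the names of all bits set in flags; zero means MiniDumpNormal.
std::vector<std::string> decode_minidump_type(std::uint32_t flags) {
    std::vector<std::string> names;
    if (flags == 0) names.push_back("MiniDumpNormal");
    for (const FlagName& f : kMiniDumpFlags)
        if (flags & f.bit) names.push_back(f.name);
    return names;
}
```

    Decoding 0x9164 yields the same six names listed in the output above. In particular MiniDumpWithFullMemory (0x0002) is absent, so this dump does not contain the full process memory.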

    If it's a user mode dump, additional information needs to be retrieved from the dump.

    1. What the command line is, and whether the process is a generic host such as dllhost.exe, svchost.exe, taskhost.exe or w3wp.exe.
    2. Understand the bitness - whether it is a 64bit process or a 32bit process. Debugging a 64bit dump of a WOW64 process can be tricky.
    3. Whether the CLR is involved, and what the CLR version is (note there could be more than one CLR hosted).

    (to be continued...)

    Windows 8 and conhost.exe

    While debugging a console application on Windows 8, I noticed that the console application tries to create a process at the very beginning:

    windbg.exe -xe ld:ntdll.dll -c "bm ntdll!*CreateProcess*; g; k" cmd.exe

    CommandLine: cmd.exe
    ModLoad: 000007ff`01d60000 000007ff`01f1e000   ntdll.dll
    ntdll!RtlUserThreadStart:
    000007ff`01d7c3d0 4883ec48        sub     rsp,48h
    Processing initial command 'bm ntdll!*CreateProcess*; g; k'
    0:000> bm ntdll!*CreateProcess*; g; k
      1: 000007ff`01d90f60 @!"ntdll!RtlCreateProcessParametersEx"
      2: 000007ff`01d63070 @!"ntdll!NtCreateProcessEx"
    breakpoint 2 redefined
      2: 000007ff`01d63070 @!"ntdll!ZwCreateProcessEx"
      3: 000007ff`01e1bf74 @!"ntdll!RtlCreateProcessReflection"
      4: 000007ff`01da8bb4 @!"ntdll!RtlpCreateProcessRegistryInfo"
      5: 000007ff`01e1ceac @!"ntdll!RtlCreateProcessParameters"
      6: 000007ff`01d63651 @!"ntdll!ZwCreateProcess"
    breakpoint 6 redefined
      6: 000007ff`01d63651 @!"ntdll!NtCreateProcess"
    Breakpoint 1 hit
    Child-SP          RetAddr           Call Site
    000000bf`8268e558 000007fe`feea02a4 ntdll!RtlCreateProcessParametersEx
    000000bf`8268e560 000007fe`feea00be KERNELBASE!ConsoleLaunchServerProcess+0x60
    000000bf`8268e5f0 000007fe`fee95d40 KERNELBASE!ConsoleAllocate+0xf6
    000000bf`8268e8c0 000007fe`fee7f6db KERNELBASE!ConsoleInitialize+0x1d1
    000000bf`8268e950 000007fe`fee7230d KERNELBASE!KernelBaseBaseDllInitialize+0x4dd
    000000bf`8268ec20 000007ff`01d6b9be KERNELBASE!KernelBaseDllInitialize+0xd
    000000bf`8268ec50 000007ff`01d8b3fc ntdll!LdrpCallInitRoutine+0x3e
    000000bf`8268eca0 000007ff`01d8a88b ntdll!LdrpInitializeNode+0x192
    000000bf`8268eda0 000007ff`01d8e74e ntdll!LdrpInitializeGraph+0x6f
    000000bf`8268ede0 000007ff`01d8c322 ntdll!LdrpInitializeGraph+0x8d
    000000bf`8268ee20 000007ff`01d8cc02 ntdll!LdrpPrepareModuleForExecution+0x1a5
    000000bf`8268ee70 000007ff`01d8337b ntdll!LdrpLoadDll+0x344
    000000bf`8268f0a0 000007ff`01d9264f ntdll!LdrLoadDll+0xa7
    000000bf`8268f120 000007ff`01d91826 ntdll!LdrpInitializeProcess+0x1664
    000000bf`8268f420 000007ff`01d7c1ae ntdll!_LdrpInitialize+0x1565e
    000000bf`8268f490 00000000`00000000 ntdll!LdrInitializeThunk+0xe

    0:000> dc rbx
    000000bf`8268e660  00000000 00000000 00000000 00000000  ................
    000000bf`8268e670  003f005c 005c003f 003a0043 0057005c  \.?.?.\.C.:.\.W.
    000000bf`8268e680  004e0049 004f0044 00530057 0073005c  I.N.D.O.W.S.\.s.
    000000bf`8268e690  00730079 00650074 0033006d 005c0032  y.s.t.e.m.3.2.\.
    000000bf`8268e6a0  006f0063 0068006e 0073006f 002e0074  c.o.n.h.o.s.t...
    000000bf`8268e6b0  00780065 00200065 00780030 00660066  e.x.e. .0.x.f.f.
    000000bf`8268e6c0  00660066 00660066 00660066 00000000  f.f.f.f.f.f.....
    000000bf`8268e6d0  8268e960 000000bf 00000008 00000000  `.h.............

    This means that on Windows 8 the conhost.exe process is created by the console application itself, instead of by CSRSS. conhost.exe also always has the native bitness (on the 64-bit version of Windows 8, only the 64-bit version of conhost.exe is available).
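
    The DWORD columns in the dc rbx output above hold a NUL-terminated UTF-16LE string, two characters per DWORD with the low word first. As a rough sketch (the helper name decodeDcDwords is made up for this illustration, and the input is simply the DWORDs copied from the dump starting at 000000bf`8268e670), the command line can be reassembled like this:

```javascript
// Decode the DWORD columns of a WinDBG "dc" dump as a NUL-terminated
// UTF-16LE string. decodeDcDwords is a hypothetical helper, not a WinDBG API.
function decodeDcDwords(dwords) {
  var chars = [];
  for (var i = 0; i < dwords.length; i++) {
    var value = parseInt(dwords[i], 16);
    // Little-endian: the low 16 bits come first in memory.
    var units = [value & 0xffff, value >>> 16];
    for (var j = 0; j < units.length; j++) {
      if (units[j] === 0) return chars.join(''); // stop at the terminator
      chars.push(String.fromCharCode(units[j]));
    }
  }
  return chars.join('');
}

// DWORDs copied from the dump above, starting at 000000bf`8268e670.
var dump = [
  '003f005c', '005c003f', '003a0043', '0057005c',
  '004e0049', '004f0044', '00530057', '0073005c',
  '00730079', '00650074', '0033006d', '005c0032',
  '006f0063', '0068006e', '0073006f', '002e0074',
  '00780065', '00200065', '00780030', '00660066',
  '00660066', '00660066', '00660066', '00000000'
];

var commandLine = decodeDcDwords(dump);
// commandLine is: \??\C:\WINDOWS\system32\conhost.exe 0xffffffff
```

    In a live session, du on the string address (000000bf`8268e670) would show the same text; the sketch only spells out how the dc columns map to characters.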

    Debugging into conhost.exe using .childdbg makes it pretty clear that conhost.exe is in charge of drawing the console window, handling user input and communicating with the console application:

    0  Id: 124c.d34 Suspend: 1 Teb: 000007f6`3311b000 Unfrozen
    Child-SP          RetAddr           Call Site
    00000094`85aefb38 000007f6`33b91146 ntdll!NtWaitForSingleObject+0xa
    00000094`85aefb40 000007ff`00c9167e conhost!ConsoleIoThread+0xda
    00000094`85aefd80 000007ff`01d7c3f1 KERNEL32!BaseThreadInitThunk+0x1a
    00000094`85aefdb0 00000000`00000000 ntdll!RtlUserThreadStart+0x1d
    #  1  Id: 124c.1428 Suspend: 1 Teb: 000007f6`3311e000 Unfrozen
    Child-SP          RetAddr           Call Site
    00000094`85b6fd28 000007ff`0140171e conhost!ConsoleWindowProc
    00000094`85b6fd30 000007ff`014014d7 USER32!UserCallWinProcCheckWow+0x13a
    00000094`85b6fdf0 000007f6`33b92fcc USER32!DispatchMessageWorker+0x1a7
    00000094`85b6fe70 000007ff`00c9167e conhost!ConsoleInputThread+0xd2
    00000094`85b6fed0 000007ff`01d7c3f1 KERNEL32!BaseThreadInitThunk+0x1a
    00000094`85b6ff00 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

    And cmd.exe itself doesn't draw the console window at all:

    0  Id: 1aec.5ec Suspend: 1 Teb: 000007f7`b280d000 Unfrozen
    Child-SP          RetAddr           Call Site
    00000008`50fff5b8 000007fe`fee8f17c ntdll!NtDeviceIoControlFile+0xa
    00000008`50fff5c0 000007fe`fef0bb29 KERNELBASE!ConsoleCallServerGeneric+0x118
    00000008`50fff710 000007fe`fef0b986 KERNELBASE!ReadConsoleInternal+0x131
    00000008`50fff850 000007f7`b3621025 KERNELBASE!ReadConsoleW+0x1a
    00000008`50fff890 000007f7`b362bd3e cmd!ReadBufFromConsole+0x111
    00000008`50fff960 000007f7`b3604aae cmd!_chkstk+0x3820
    00000008`50fffae0 000007f7`b36042e4 cmd!Lex+0x4be
    00000008`50fffb50 000007f7`b362d560 cmd!Parser+0x128
    00000008`50fffba0 000007f7`b361b721 cmd!_chkstk+0x5032
    00000008`50fffc00 000007ff`00c9167e cmd!mystrchr+0x27d
    00000008`50fffc40 000007ff`01d7c3f1 KERNEL32!BaseThreadInitThunk+0x1a
    00000008`50fffc70 00000000`00000000 ntdll!RtlUserThreadStart+0x1d

    Interestingly, Spy++ reports that the console window is associated with the main thread of the cmd.exe process! I believe this is a hack in the underlying implementation of GetWindowThreadProcessId for backward compatibility. Also, Spy++ cannot be used to inspect the conhost.exe message loop.

    Due to the side effects of the IFEO (Image File Execution Options) Debugger setting, a console application will fail to start if an IFEO Debugger is enabled for conhost.exe.

    The following call stack shows that conhost.exe simply creates a normal window from the Console Input Thread:

    USER32!CreateWindowExW
    conhost!CreateWindowsWindow+0x105
    conhost!InitWindowsSubsystem+0x69
    conhost!ConsoleInputThread+0x22
    KERNEL32!BaseThreadInitThunk+0x1a
    ntdll!RtlUserThreadStart+0x1d
    0:001> du @rdx
    000007f6`33b9d460  "ConsoleWindowClass"
    0:001> du @r8
    00000025`f7dc37c0  "C:\WINDOWS\SYSTEM32\cmd.exe"

    When cmd.exe exits, the Console I/O Thread is notified and terminates the conhost.exe process:

    ntdll!NtTerminateProcess+0xa
    ntdll!RtlExitUserProcess+0xb6
    conhost!ConsoleIoThread+0xac4
    KERNEL32!BaseThreadInitThunk+0x1a
    ntdll!RtlUserThreadStart+0x1d

    (to be continued...)

     

  • Rubato and Chord

    Visualize Assembly using DGML


    Starting from Visual Studio 2010 Ultimate there is a cool feature called DGML (Directed Graph Markup Language).

    I wrote a small script to convert the disassembled code from WinDBG into DGML.

    To use it, simply type the following command in a debug session:

    .shell -o LoadLibraryA.dgml -ci "uf kernel32!LoadLibraryA" cscript.exe /nologo dasm2dgml.js

    A DGML file will be generated with the given name, and here is what it looks like:

    Here is the source code:


    var EBB = [];
    
    var hypertext=function(s){
      var r=[],L=s.length;
      for(var i=0;i<L;i++){
        var c=s.charAt(i);
        switch(c){
          case '"':r.push('&quot;');break;
          case '&':r.push('&amp;');break;
          case '<':r.push('&lt;');break;
          case '>':r.push('&gt;');break;
          default:r.push(c);}}
      return r.join('');
    };
    
    var map=function(f,v){var L=v.length,r=[];for(var i=0;i<L;i++)r.push(f(v[i]));return r;};
    
    (function(){
      var blk;
    
      var CExtendedBasicBlock = function(name, previous, next){
        this.Address = '';
        this.Code = [];
        this.Name = name;
        this.Previous = previous;
        this.Next = next;
      };
    
      while(true)
      {
        if(WScript.StdIn.AtEndOfStream)
          break;
        var strSourceLine = WScript.StdIn.ReadLine().replace(/(^\s+)|(\s+$)/g, '');
        if(!strSourceLine)
          continue;
        if(strSourceLine.match(/.*:$/))
        {
          blk = new CExtendedBasicBlock(strSourceLine.slice(0, -1));
          EBB.push(blk);
        }
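        else if(blk) // skip any output that appears before the first label
        {
          // The first token is the address; then drop the address and opcode-byte columns.
          blk.Address = blk.Address || strSourceLine.match(/^[^\s]+/)[0];
          blk.Code.push(strSourceLine.replace(/[^\s]*\s+/, '').replace(/[^\s]*\s+/, ''));
        }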
      }
    })();
    
    EBB = EBB.sort(function(x, y){ return x.Address == y.Address ? 0 : x.Address > y.Address ? 1 : -1; });
    for(var i = 1; i < EBB.length; i++)
    {
      EBB[i].Previous = EBB[i - 1];
      EBB[i].Previous.Next = EBB[i];
    }
    
    WScript.Echo('<DirectedGraph Background="#FFFFFF" GraphDirection="TopToBottom" xmlns="http://schemas.microsoft.com/vs/2009/dgml">');
    WScript.Echo('  <Nodes>');
    map(function(blk){
      var content = hypertext(blk.Name + ' (' + blk.Address + ')') + '&#xD;&#xA;';
      map(function(instruction){
        content += '&#xD;&#xA;' + hypertext(instruction);
      }, blk.Code);
      WScript.Echo('    <Node Id="' + hypertext(blk.Name) + '" Label="' + content + '" />');
    }, EBB);
    WScript.Echo('  </Nodes>');
    WScript.Echo('  <Links>');
    map(function(blk){
      map(function(instruction){
        map(function(x){
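          // Link to x when its label appears in the instruction as a whole token,
          // i.e. followed by the end of the line or a space.
          var idx = instruction.indexOf(x.Name);
          var next = idx >= 0 ? instruction.charAt(idx + x.Name.length) : null;
          if(next === '' || next === ' ')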
            WScript.Echo('    <Link Source="' + hypertext(blk.Name) + '" Target="' + hypertext(x.Name) + '" />');
        }, EBB);
      }, blk.Code);
      // A block falls through to the next one unless it ends with jmp or ret.
      if(blk.Next && blk.Code.length && !(blk.Code[blk.Code.length - 1].match(/^[^\s]+/)[0] in {jmp: 0, ret: 0}))
        WScript.Echo('    <Link Category="FallThrough" Source="' + hypertext(blk.Name) + '" Target="' + hypertext(blk.Next.Name) + '" />');
    }, EBB);
    WScript.Echo('  </Links>');
    WScript.Echo('  <Styles>');
    WScript.Echo('    <Style TargetType="Node">');
    WScript.Echo('      <Setter Property="FontFamily" Value="Consolas" />');
    WScript.Echo('      <Setter Property="FontSize" Value="11" />');
    WScript.Echo('      <Setter Property="Background" Value="White" />');
    WScript.Echo('      <Setter Property="NodeRadius" Value="2" />');
    WScript.Echo('    </Style>');
    WScript.Echo('    <Style TargetType="Link">');
    WScript.Echo('        <Condition Expression="HasCategory(\'FallThrough\')" />');
    WScript.Echo('        <Setter Property="Background" Value="Red" />');
    WScript.Echo('        <Setter Property="Stroke" Value="Red" />');
    WScript.Echo('    </Style>');
    WScript.Echo('  </Styles>');
    WScript.Echo('</DirectedGraph>');
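
    The script assumes that every instruction line printed by uf has three whitespace-separated columns: address, opcode bytes, and the instruction text. A small sketch of just that step (splitUfLine is a hypothetical helper name; the input line is taken from the WinDBG output in the conhost.exe post above):

```javascript
// Split one trimmed "uf" instruction line into its address and instruction
// text, mirroring the match() and the two chained replace() calls in the script.
// splitUfLine is a hypothetical name used only for this illustration.
function splitUfLine(line) {
  var address = line.match(/^[^\s]+/)[0];
  var instruction = line
    .replace(/[^\s]*\s+/, '')  // drop the address column
    .replace(/[^\s]*\s+/, ''); // drop the opcode-byte column
  return { address: address, instruction: instruction };
}

var parts = splitUfLine('000007ff`01d7c3d0 4883ec48        sub     rsp,48h');
// parts.address     is '000007ff`01d7c3d0'
// parts.instruction is 'sub     rsp,48h'
```

    Because addresses are fixed-width hexadecimal strings, comparing them as strings is enough for the sort that orders the blocks.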
    

    Notes:

    1. This script cannot generate a 100% accurate control flow diagram; indirect branches (e.g. jmp eax) require further analysis.
    2. I haven't had a chance to test it under WOA (ARM32), so I leave that as homework for our readers.
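
    For note 1, one possible starting point is to flag the branches whose target the script cannot see. The following is only a heuristic sketch under my own assumptions (isIndirectBranch is a hypothetical helper, it only knows x86/x64 general-purpose register names, and it treats any operand containing a bracket as a memory-indirect target):

```javascript
// Heuristic: does this instruction text (as stored in blk.Code) branch to a
// target that is only known at run time? Hypothetical helper, x86/x64 only.
function isIndirectBranch(instruction) {
  var m = /^(jmp|call)\s+(.*)$/.exec(instruction);
  if (!m) return false;
  var operand = m[2];
  // Memory-indirect, e.g. "jmp qword ptr [rax+8]".
  if (operand.indexOf('[') >= 0) return true;
  // Register-indirect, e.g. "jmp eax".
  return /^(e?[abcd]x|e?[sd]i|e?[sb]p|r[abcd]x|r[sd]i|r[sb]p|r(8|9|1[0-5]))\s*$/i.test(operand);
}

var flagged = [
  isIndirectBranch('jmp     eax'),                  // true
  isIndirectBranch('jmp     qword ptr [rax+8]'),    // true
  isIndirectBranch('jmp     kernel32!LoadLibraryA'), // false
  isIndirectBranch('mov     edi,edi')               // false
];
```

    Blocks ending in such an instruction could be given a distinct Category in the DGML output so that they stand out for manual analysis.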

    Enjoy:)
