Rubato and Chord

Reiley's technical blog

April, 2013

  • Rubato and Chord

    The Pit of Success


    The Pit of Success: in stark contrast to a summit, a peak, or a journey across a desert to find victory through many trials and surprises, we want our customers to simply fall into winning practices by using our platform and frameworks. To the extent that we make it easy to get into trouble we fail.

    - by Rico Mariani

    In my self-introduction, I mentioned my past experience developing ATL, CRT, MFC and STL. I've learned a lot from that experience, and as I work with people who develop services on the .NET platform, I keep running into interesting findings. Each time I step back and look at what has been built, I love the "Pit of Success" concept from Rico Mariani even more. So here I'm trying to share some of my personal understanding.

    ATL (not including ATL Server) is my personal favorite library for the following reasons:

    • ATL is a library rather than a framework
      • in order to adopt ATL you don't have to change any existing architecture
    • ATL is lightweight
      • in most cases you can just use ATL header files, without dealing with *.lib files and DLLs
      • ATL won't bloat the binary size much
      • ATL certainly won't hurt performance
    • ATL is predictable
      • debugging through ATL source is straightforward
      • the main purpose of ATL is to make COM programming easier in C++, not to hide the complexity of the operating system concepts
      • the COM wrappers define their thread safety very clearly

    CRT is the library from which I learned a lot:

    • CRT acts as a bridge between the operating system and most applications
      • dealing with application startup and cleanup
      • providing runtime services (e.g. backing C++ exception scope with SEH and exception filters)
      • balancing between the underlying system implementation and the C/C++ standard specification
    • CRT has evil contracts with the C++ compiler toolchain
    • CRT is consumed not only by normal applications, but also by the operating system and the compiler toolchain (which sounds like an infinite loop)
    • CRT always sits on the front line of supporting new hardware architectures
    • CRT needs to be the most stable and compatible library

    MFC is a framework consisting of class libraries:

    • it's easy to get started with MFC
    • MFC takes care of window management, messaging, COM and many great things
    • you really need to understand MFC in order to use it well, especially when you run into bug hell

    STL is a C++ template library, which I used extensively (in combination with Boost) when I was in school and when I developed STL, and which I still use when I have good reason to:

    • given a free choice, I would normally tend not to use STL
    • debugging template code is not easy, especially the code that makes extensive use of template meta-programming and macro meta-programming
    • running code coverage analysis against template code is tricky (source line coverage is almost meaningless, you have to use basic block coverage)
    • it's a bad idea to pass STL objects (e.g. as parameters or exceptions) across a module boundary, since the underlying contract (e.g. object layout) might differ between modules
    • binary size might bloat and turn into a serious problem
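
    One common way to respect the module-boundary point above is to keep STL types inside the module and let only plain C types cross the boundary. The following is a minimal sketch under that assumption; the function name get_module_name and the caller-provided-buffer convention are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Internal implementation: free to use STL types, because this std::string
// never leaves the module that created it.
static std::string build_name_internal() {
    return std::string("rubato") + "-" + "chord";
}

// Hypothetical exported function. Only plain C types cross the boundary and
// the caller owns the buffer, so neither side depends on the other side's
// std::string layout or allocator.
// Returns the number of bytes required, including the terminating '\0'.
// The name is copied only when the buffer is large enough.
extern "C" std::size_t get_module_name(char* buffer, std::size_t buffer_size) {
    // A null buffer is only valid together with a size of zero.
    assert(buffer != nullptr || buffer_size == 0);
    const std::string name = build_name_internal();
    const std::size_t required = name.size() + 1;
    if (buffer != nullptr && buffer_size >= required) {
        std::memcpy(buffer, name.c_str(), required);
    }
    return required;
}
```

    A caller would typically call the function once with a null buffer to learn the required size, allocate with its own allocator, and then call it again to receive the text.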


  • Rubato and Chord

    Error and Exception Revisited


    Unless suffering is the direct and immediate object of life, our existence must entirely fail of its aim. It is absurd to look upon the enormous amount of pain that abounds everywhere in the world, and originates in needs and necessities inseparable from life itself, as serving no purpose at all and the result of mere chance. Each separate misfortune, as it comes, seems, no doubt, to be something exceptional; but misfortune in general is the rule.

    - by Arthur Schopenhauer

    In the world of programming, errors and exceptions seem to be unavoidable; this is especially true when it comes to writing production-quality code.

    Windows programming can be challenging when both errors and exceptions are used.


    Exceptions always exist; even if you don't care about them, they disrupt the normal flow of execution. In the Windows operating system, exceptions are implemented as SEH/VEH, with support from the CPU.

    The key characteristics of exception in Windows are:

    1. They disrupt the normal flow of execution, which translates to pipeline invalidation and slowness.
    2. They can be used consistently in user mode and kernel mode.
    3. They come with a lot of great features, such as continuing execution and first-chance versus second-chance handling.
    4. An unhandled exception goes to the operating system, which kills the application (e.g. Dr. Watson) or the system (e.g. BSOD).
    5. If you swallow the exceptions used by the operating system or runtime (which you shouldn't catch), the application might not function correctly (e.g. you might get an access violation while accessing the PAGE_GUARD page of the call stack).

    The third famous exception is Out of Memory - for device drivers and server applications, you always want to handle it; for client applications, it is probably not as critical (e.g. if the Visual Studio IDE is running out of memory, we just let it crash).

    The second famous exception is Stack Overflow - for hosting environments and fundamental libraries like CRT, you need to take it into consideration; in other cases it means you have a design issue, and normally you don't want to handle it.

    Let's take a look at the following pseudo code:

    try {
      // do something
    } catch (StackOverflowException ex) {
      log("oops, stack overflow {0}", ex.stack);
      throw ex;
    } finally {
      // close file handle, etc.
    }

    There are several things I can tell:

    1. Using "throw ex" ruins the exception information (the stack trace is reset); use "throw" instead.
    2. The "log" function doesn't have much stack space left, so it could trigger another stack overflow.
    3. When the exception is re-thrown, we are already miles away from the original place of the problem - we are not keeping the scene intact, and nobody would want to debug a dump file for this case.
    4. The process is dying; closing the file handle will not make things any better, and the operating system would do that for you anyway. More importantly, it is very likely that you are making things even worse.
    5. If you are using a recent version of the .NET Framework, normally you are NOT allowed to catch StackOverflowException:
      In prior versions of the .NET Framework, your application could catch a StackOverflowException object (for example, to recover from unbounded recursion). However, that practice is currently discouraged because significant additional code is required to reliably catch a stack overflow exception and continue program execution.

      Starting with the .NET Framework version 2.0, a StackOverflowException object cannot be caught by a try-catch block and the corresponding process is terminated by default. Consequently, users are advised to write their code to detect and prevent a stack overflow. For example, if your application depends on recursion, use a counter or a state condition to terminate the recursive loop. Note that an application that hosts the common language runtime (CLR) can specify that the CLR unload the application domain where the stack overflow exception occurs and let the corresponding process continue. For more information, see ICLRPolicyManager Interface and Hosting Overview.

      Windows 95, Windows 98, Windows 98 Second Edition, Windows Millennium Edition Platform Note: A thrown StackOverflowException cannot be caught by a try-catch block. Consequently, the exception causes the process to terminate immediately.
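
    The advice quoted above ("use a counter or a state condition to terminate the recursive loop") can be sketched in a few lines. This is only an illustration, assuming a small recursive parser over untrusted input; the names parse_group and is_balanced and the limit kMaxDepth are hypothetical, and a real limit depends on frame size and available stack.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical nesting limit chosen for the sketch.
constexpr std::size_t kMaxDepth = 256;

// Parses one parenthesised group such as "(()())" starting at text[pos].
// Recurses once per nesting level and refuses to go deeper than kMaxDepth,
// so hostile input produces a clean failure instead of a stack overflow.
bool parse_group(const std::string& text, std::size_t& pos, std::size_t depth) {
    if (depth > kMaxDepth) {
        return false;  // the counter terminates the recursion
    }
    if (pos >= text.size() || text[pos] != '(') {
        return false;
    }
    ++pos;  // consume '('
    while (pos < text.size() && text[pos] == '(') {
        if (!parse_group(text, pos, depth + 1)) {
            return false;
        }
    }
    if (pos >= text.size() || text[pos] != ')') {
        return false;
    }
    ++pos;  // consume ')'
    return true;
}

// True when the whole string is exactly one well-formed group.
bool is_balanced(const std::string& text) {
    std::size_t pos = 0;
    return parse_group(text, pos, 1) && pos == text.size();
}

// Usage: shallow input is accepted, absurdly deep input is rejected cleanly.
void recursion_guard_example() {
    assert(is_balanced("(()())"));
    assert(!is_balanced(std::string(100000, '(')));
}
```

    The point of the design is that the failure becomes an ordinary return value that the caller can handle, rather than an exception that cannot be caught.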

    The most famous exception is Access Violation; in normal cases this is a killer bug that stops the entire development team.


    An error is more of a convention. The main characteristics of errors are:

    1. Lightweight - errors don't require infrastructure support from the operating system or the CPU.
    2. Less picky compared to exceptions - an error won't complain if you haven't paid enough attention to it; however, it might eventually kill you if not handled properly.

    Windows Error Code

    Most Win32 APIs return a BOOL value. If FALSE is returned, the error code is stored in the TEB as a DWORD, which can be retrieved by calling GetLastError.
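
    To make the convention concrete, here is a portable model of it. This is not the real Win32 implementation: a thread_local variable stands in for the last-error slot in the TEB, and the names in the model namespace (including the parse_digit API) are hypothetical. Only the numeric value 87 is taken from Windows, where it is ERROR_INVALID_PARAMETER.

```cpp
#include <cassert>
#include <cstdint>

namespace model {

using ErrorCode = std::uint32_t;                  // stands in for DWORD
constexpr ErrorCode kErrorSuccess = 0;            // same value as ERROR_SUCCESS
constexpr ErrorCode kErrorInvalidParameter = 87;  // same value as ERROR_INVALID_PARAMETER

// One slot per thread, playing the role of the last-error field in the TEB.
thread_local ErrorCode g_last_error = kErrorSuccess;

void set_last_error(ErrorCode code) { g_last_error = code; }
ErrorCode get_last_error() { return g_last_error; }

// Hypothetical API in the "return a boolean, stash the reason" style.
// On success the last-error slot is left untouched.
bool parse_digit(char c, int* out) {
    if (out == nullptr || c < '0' || c > '9') {
        set_last_error(kErrorInvalidParameter);
        return false;
    }
    *out = c - '0';
    return true;
}

}  // namespace model

// Usage: check the return value first, then read the code immediately,
// before any other call on this thread can overwrite it.
void last_error_example() {
    int value = 0;
    if (!model::parse_digit('x', &value)) {
        assert(model::get_last_error() == model::kErrorInvalidParameter);
    }
}
```

    The ordering in the usage function matters: the stored code is only meaningful when the call has just reported failure.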

    You can use WinDBG's ! or Visual Studio's $ERR trick to retrieve the last error code from the TEB, which also works for dump files.

    The good side of storing the last error in a central place (e.g. the TEB) is the ability to set a data breakpoint to see where the error is coming from. Also, in later versions of Windows there is a feature to do similar things:

    The registry API in advapi32.dll is special: the Windows error code is returned directly as a LONG (signed long).

    WinSock and NetAPI have a similar concept but a different mapping.


    The lower layers of the operating system make use of NTSTATUS, which has a structure similar to that of a Win32 error. An incomplete mapping table can be found in Mapping NT Status Error Codes to Win32 Error Codes.

    For LSA (Local Security Authority) specifically, the NTSTATUS return value can be converted to Win32 error using LsaNtStatusToWinError.
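
    As a rough illustration of how an NTSTATUS value is laid out, the sketch below splits one into its severity, facility and code fields. The type alias and helper names are local stand-ins so the code compiles without the Windows SDK; the numeric constants are the values defined in ntstatus.h, and nt_success mirrors the NT_SUCCESS macro.

```cpp
#include <cassert>
#include <cstdint>

using NtStatus = std::int32_t;  // NTSTATUS is a signed 32-bit value

// Values as defined in ntstatus.h.
constexpr NtStatus kStatusSuccess = 0x00000000;
constexpr NtStatus kStatusGuardPageViolation = static_cast<NtStatus>(0x80000001u);
constexpr NtStatus kStatusAccessViolation = static_cast<NtStatus>(0xC0000005u);
constexpr NtStatus kStatusStackOverflow = static_cast<NtStatus>(0xC00000FDu);

// The top two bits carry the severity.
enum class Severity { Success = 0, Informational = 1, Warning = 2, Error = 3 };

constexpr Severity severity_of(NtStatus status) {
    return static_cast<Severity>(static_cast<std::uint32_t>(status) >> 30);
}

// Bits 16..27 carry the facility.
constexpr std::uint32_t facility_of(NtStatus status) {
    return (static_cast<std::uint32_t>(status) >> 16) & 0x0FFFu;
}

// The low 16 bits carry the status code itself.
constexpr std::uint32_t code_of(NtStatus status) {
    return static_cast<std::uint32_t>(status) & 0xFFFFu;
}

// Mirrors NT_SUCCESS: success and informational values are non-negative.
constexpr bool nt_success(NtStatus status) { return status >= 0; }

// Usage: an access violation is an error-severity status with code 5.
void ntstatus_example() {
    assert(severity_of(kStatusAccessViolation) == Severity::Error);
    assert(code_of(kStatusAccessViolation) == 5);
}
```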


    COM makes use of HRESULT, which was designed to be a super container for all kinds of error codes. There is a macro HRESULT_FROM_WIN32 (in later versions of Windows this has been changed to an inline function) which converts a Windows error code to an HRESULT.

    One caveat about HRESULT is that you always have to keep S_OK, S_FALSE, SUCCEEDED and FAILED in mind, and understand the differences.
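
    The caveat is easier to see in code. The sketch below re-creates the relevant definitions with local names so it compiles without the Windows SDK: succeeded, failed and hresult_from_win32 are stand-ins that follow the same rules as SUCCEEDED, FAILED and HRESULT_FROM_WIN32, and the constants carry the same values as S_OK, S_FALSE and FACILITY_WIN32.

```cpp
#include <cassert>
#include <cstdint>

using HResult = std::int32_t;  // HRESULT is a signed 32-bit value

constexpr HResult kSOk = 0;                  // S_OK
constexpr HResult kSFalse = 1;               // S_FALSE
constexpr std::uint32_t kFacilityWin32 = 7;  // FACILITY_WIN32

// SUCCEEDED and FAILED only look at the sign bit.
constexpr bool succeeded(HResult hr) { return hr >= 0; }
constexpr bool failed(HResult hr) { return hr < 0; }

// Same mapping as HRESULT_FROM_WIN32: zero (and anything already negative)
// passes through unchanged; any other code is placed in the low 16 bits,
// tagged with FACILITY_WIN32, and marked as a failure.
constexpr HResult hresult_from_win32(std::uint32_t error) {
    return static_cast<HResult>(error) <= 0
               ? static_cast<HResult>(error)
               : static_cast<HResult>((error & 0x0000FFFFu) |
                                      (kFacilityWin32 << 16) | 0x80000000u);
}

// Usage: S_FALSE passes the SUCCEEDED test but is not S_OK, so comparing
// against S_OK and testing SUCCEEDED answer two different questions.
void hresult_example() {
    assert(succeeded(kSFalse));
    assert(kSFalse != kSOk);
}
```

    For example, Win32 error 5 (ERROR_ACCESS_DENIED) maps to 0x80070005, while error 0 stays S_OK.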


    Since there are so many different kinds of error codes, even people working at Microsoft may get confused. That's why people tend to create tools to make the situation better:

    1. WinDBG extension command !ext.error, which supports both Windows Error Code and NTSTATUS.
    2. Visual Studio debugger.
    3. Error Code Look-up tool, implemented by the Exchange team.