November, 2012

  • The Old New Thing

    Microsoft Money crashes during import of account transactions or when changing a payee of a downloaded transaction

    • 49 Comments
    Update: An official fix for this issue has been released to Windows Update, although I must say that I think my patch has more style than the official one. You do not need to patch your binary. Just keep your copy of Windows 8 up to date and you'll be fine.

    For the five remaining Microsoft Money holdouts (meekly raises hand), here's a patch for a crashing bug during import of account transactions or when changing a payee of a downloaded transaction in Microsoft Money Sunset Deluxe. Patch the mnyob99.dll file as follows:

    • File offset 003FACE8: Change 85 to 8D
    • File offset 003FACED: Change 50 to 51
    • File offset 003FACF0: Change FF to 85
    • File offset 003FACF6: Change E8 to B9

    Note that this patch is completely unsupported. If it makes your computer explode or transfers all your money to an account in the Cayman Islands, well, too bad for you.
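    If you'd rather apply such patches programmatically than in a hex editor, here is a hedged sketch (my own helper, not a supported tool): it verifies the original byte before writing, so you don't accidentally patch a different build. As the warning above says, work on a backup copy.

```cpp
#include <cstdio>

// One byte-patch record: where, what we expect to find, what to write.
struct BytePatch
{
    long offset;
    unsigned char expected;
    unsigned char replacement;
};

// Apply a single-byte patch to an already-opened binary file.
// Refuses to write if the byte at the offset isn't the expected
// value, which guards against patching the wrong build.
bool ApplyPatch(std::FILE* f, const BytePatch& p)
{
    if (std::fseek(f, p.offset, SEEK_SET) != 0) return false;
    if (std::fgetc(f) != p.expected) return false; // wrong build; bail out
    if (std::fseek(f, p.offset, SEEK_SET) != 0) return false;
    return std::fputc(p.replacement, f) == p.replacement;
}
```

    You would run it once per row of the table above, opening the file in `"r+b"` mode.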

    If you are not one of the five remaining customers of Microsoft Money, you can still treat this as a little exercise in application compatibility debugging. Why application compatibility debugging? Because the problem seems to be more prevalent on Windows 8 machines.

    Note that I used no special knowledge about Microsoft Money. All this debugging was performed with information you also have access to. It's not like I have access to the Microsoft Money source code. And I did this debugging entirely on my own. It was not part of any official customer support case or anything like that. I was just debugging a crash that I kept hitting.

    The crash occurs in the function utlsrf08!DwStringLengthA:

    utlsrf08!DwStringLengthA:
            push    ebp
            mov     ebp,esp
            mov     eax,dword ptr [ebp+8]
            lea     edx,[eax+1]
    again:
            mov     cl,byte ptr [eax]
            inc     eax
            test    cl,cl
            jne     again
            sub     eax,edx
            pop     ebp
            ret     4
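    In C++ terms, this routine is the classic strlen implemented via pointer subtraction. A sketch (the function name is mine, mirroring the disassembly instruction for instruction):

```cpp
#include <cstddef>

// C++ rendering of the listing above: eax runs past the null
// terminator, edx holds start + 1, so eax - edx is the length.
std::size_t StringLengthSketch(const char* s)
{
    const char* startPlusOne = s + 1;   // lea edx,[eax+1]
    char c;
    do {
        c = *s++;                       // mov cl,[eax] / inc eax
    } while (c != '\0');                // test cl,cl / jne again
    return static_cast<std::size_t>(s - startPlusOne); // sub eax,edx
}
```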
    

    The proximate cause is that the string pointer in eax is garbage. If you unwind the stack one step, you'll see that the pointer came from here:

            lea     eax,[ebp-20Ch]
            push    eax
            call    dword ptr [__imp__GetCurrentProcessId]
            push    eax
            push    offset "Global\TRIE@%d!%s"
            lea     eax,[ebp-108h]
            push    104h
            push    eax
            call    mnyob99!DwStringFormatA
            add     esp,14h
            lea     eax,[ebp-2E4h]
            push    eax
            push    5Ch
            push    dword ptr [ebp-2E4h] ; invalid pointer
            call    mnyob99!DwStringLengthA
            sub     eax,7
            push    eax
            lea     eax,[ebp-101h]
            push    eax
            jmp     l2
    l1:
            mov     eax,dword ptr [ebp-2E4h]
            mov     byte ptr [eax],5Fh
            lea     eax,[ebp-2E4h]
            push    eax
            push    5Ch
            push    dword ptr [ebp-2E4h]
            call    mnyob99!DwStringLengthA
            push    eax
            push    dword ptr [ebp-2E4h]
    l2:
            call    mnyob99!FStringFindCharacterA
            cmp     dword ptr [ebp-2E4h],edi
            jne     l1
    

    I was lucky in that all the function calls here were to imported functions, so I could extract the names from the imported function table. For example, the call to DwStringFormatA was originally

            call    mnyob99!CBillContextMenu::SetHwndNotifyOnGoto+0x1e56a (243fc3cc)
    

    But the target address is an import stub:

            jmp     dword ptr [mnyob99+0x1ec0 (24001ec0)]
    

    And then I can walk the import table to see that this was the import table entry for utlsrf08!DwStringFormatA. From the function name, it's evident that this is some sort of sprintf-like function. (If you disassemble it, you'll see that it's basically a wrapper around vsnprintf.)

    Reverse-compiling this code, we get

    char name[...];
    char buffer[MAX_PATH];
    char *backslash;
    ...
    DwStringFormatA(buffer, MAX_PATH, "Global\\TRIE@%d!%s",
                    GetCurrentProcessId(), name);
    
    // Change all backslashes (except for the first one) to underscores
    if (FStringFindCharacterA(buffer + 7, DwStringLengthA(backslash) - 7,
                              '\\',&backslash))
    {
      do {
        *backslash = '_'; // Change backslash to underscore
      } while (FStringFindCharacterA(backslash, DwStringLengthA(backslash),
                                     '\\',&backslash));
    }
    

    (Remember, all variable names are made-up since I don't have source code access. I'm just working from the disassembly.)

    At this point, you can see the bug: It's an uninitialized variable at the first call to String­Find­CharacterA. Whether we crash or survive is a matter of luck. If the uninitialized variable happens to be a pointer to readable data, then the Dw­String­LengthA will eventually find the null terminator, and since in practice the string does not contain any extra backslashes, the call to FString­Find­CharacterA fails, and nobody gets hurt.

    But it looks like their luck ran out, and now the uninitialized variable contains something that is not a valid pointer.

    The if test should have been

    if (FStringFindCharacterA(buffer + 7, DwStringLengthA(buffer) - 7,
                              '\\',&backslash))
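    As a standalone illustration, here is the corrected logic with standard library calls standing in for the Money helpers (the wrapper function and its argument values are made up):

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Build the kernel object name, then flatten every backslash after
// the "Global\" prefix into an underscore, mirroring the fixed code.
// std::snprintf and std::strchr stand in for DwStringFormatA and
// FStringFindCharacterA.
void BuildTrieName(char* buffer, std::size_t cch, int pid, const char* name)
{
    std::snprintf(buffer, cch, "Global\\TRIE@%d!%s", pid, name);
    // Skip the first seven characters ("Global\") so the namespace
    // separator survives; replace the rest.
    for (char* p = std::strchr(buffer + 7, '\\'); p != nullptr;
         p = std::strchr(p, '\\')) {
        *p = '_';
    }
}
```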
    

    This means changing the

            push    dword ptr [ebp-2E4h]
    

    to

        lea     eax,[ebp-108h]
            push    eax
    

    Unfortunately, the patch is one byte larger than the existing code, so we will need to get a little clever in order to get it to fit.

    One trick is to rewrite the test as

    if (FStringFindCharacterA(buffer + 7, DwStringLengthA(buffer + 7),
                              '\\',&backslash))
    

    That lets us rewrite the assembly code as

            lea     eax,[ebp-2E4h]
            push    eax
            push    5Ch
            lea     eax,[ebp-101h]          ; \ was "push dword ptr [ebp-2E4h]"
            push    eax                     ; /
            call    mnyob99!DwStringLengthA ; unchanged but code moved down one byte
            nop                             ; \ was "sub eax,7" (3-byte instruction)
            nop                             ; /
            push    eax
            lea     eax,[ebp-101h]
            push    eax
    

    The new instructions (lea and push) are one byte larger than the original push, but we got rid of the three-byte sub eax, 7, so it's a net savings of two bytes, which therefore fits.

    However, I'm going to crank the nerd level up another notch and try to come up with a patch that involves modifying as few bytes as possible. In other words, I'm going for style points.

    To do this, I'm going to take advantage of the fact that the string length is the return value of Dw­String­FormatA, so that lets me eliminate the call to Dw­String­LengthA altogether. However, this means that I have to be careful not to damage the value in eax before I get there.

            lea     ecx,[ebp-2E4h] ; was "lea eax,[ebp-2E4h]"
            push    ecx            ; was "push eax"
            push    5Ch
            nop                    ; \
            nop                    ; |
            nop                    ; |
            nop                    ; | was "push dword ptr [ebp-2E4h]"
            nop                    ; |
            nop                    ; /
            nop                    ; \
            nop                    ; |
            nop                    ; | was "call mnyob99!DwStringLengthA"
            nop                    ; |
            nop                    ; /
            sub     eax,7
            push    eax
            lea     eax,[ebp-101h]
            push    eax
    

    Patching the lea eax, ... to be lea ecx, ... can be done with a single byte, and the push eax is a single-byte instruction as well, so the first two patches can be done with one byte each. That leaves me with 11 bytes that need to be nop'd out.

    The naïve way of nopping out eleven bytes is simply to patch in 11 nop instructions, but you can do better by taking advantage of the bytes that are already there.

    ffb51cfdffff    push    dword ptr [ebp-2E4h]
    85b51cfdffff    test    dword ptr [ebp-2E4h],esi
    
    e8770a0000      call    mnyob99!DwStringLengthA
    b9770a0000      mov     ecx,0A77h
    

    By patching a single byte in each of the two instructions, I can turn them into effective nops by making them do nothing interesting. The first one tests the uninitialized variable against some garbage bits, and the second one loads an unused register with a constant. (Since the ecx register is going to be trashed by the call to FStringFindCharacterA, we are free to modify it all we want prior to the call. No code could have relied on it anyway.)

    That second patch is a variation of one I called out some time ago, except that instead of patching out the call with a mov eax, immed32, we're using a mov ecx, immed32, because the value in the eax register is still important.

    Here's the final result:

            lea     ecx,[ebp-2E4h]           ; was "lea eax,[ebp-2E4h]"
            push    ecx                      ; was "push eax"
            push    5Ch
            test    dword ptr [ebp-2E4h],esi ; was "push dword ptr [ebp-2E4h]"
            mov     ecx,0a77h                ; was "call mnyob99!DwStringLengthA"
            sub     eax,7
            push    eax
            lea     eax,[ebp-101h]
            push    eax
    

    Bonus chatter: When I shared this patch with my friends, I mentioned that this patch made me feel like my retired colleague Jeff, who had a reputation for accomplishing astonishing programming tasks in his spare time. You would pop into his office asking for some help, and he'd fire up some program you'd never seen before.

    "What's that?" you'd ask.

    "Oh, it's a debugger I wrote," he'd calmly reply.

    Or you'd point him to a program and apologize, "Sorry, I only compiled it for x86. There isn't an Alpha version."

    "That's okay, I'll run it in my emulator," he'd say, matter-of-factly.

    (And retiring from Microsoft hasn't slowed him down. Here's an IBM PC Model 5150 emulator written in JavaScript.)

    Specifically, I said, "I feel like Jeff, who does this sort of thing before his morning coffee."

    Jeff corrected me. "If this was something I used to do before coffee, that probably meant I was up all night. Persistence >= talent."

  • The Old New Thing

    The debugger lied to you because the CPU was still juggling data in the air

    • 28 Comments

    A colleague was studying a very strange failure, which I've simplified for expository purposes.

    The component in question has the following basic shape, ignoring error checking:

    // This is a multithreaded object
    class Foo
    {
    public:
     void BeginUpdate();
     void EndUpdate();
    
     // These methods can be called at any time
     int GetSomething(int x);
    
     // These methods can be called only between
     // BeginUpdate/EndUpdate.
     void UpdateSomething(int x);
    
    private:
     Foo() : m_cUpdateClients(0), m_pUpdater(nullptr) { ... }
    
     LONG m_cUpdateClients;
    
     Updater *m_pUpdater;
    };
    

    There are two parts of the Foo object. One part that is essential to the object's task, and another part that is needed only when updating. The parts related to updating are expensive, so the Foo object sets them up only when an update is active. You indicate that an update is active by calling Begin­Update, and you indicate that you are finished updating by calling End­Update.

    // Code in italics is wrong
    void Foo::BeginUpdate()
    {
     LONG cClients = InterlockedIncrement(&m_cUpdateClients);
     if (cClients == 1) {
      // remember, error checking has been elided
      m_pUpdater = new Updater();
     }
     // else, we are already initialized for updating,
     // so nothing to do
    }
    
    void Foo::EndUpdate()
    {
     LONG cClients = InterlockedDecrement(&m_cUpdateClients);
     if (cClients == 0) {
      // last update client has disconnected
      delete m_pUpdater;
      m_pUpdater = nullptr;
     }
    }
    

    There are a few race conditions here, and one of them manifested itself in a crash. (If two threads call Begin­Update at the same time, one of them will increment the client count to 1 and the other will increment it to 2. The one which increments it to 1 will get to work initializing m_pUpdater, whereas the second one will run ahead on the assumption that the updater is fully-initialized.)
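    One hedged way to close that race (a sketch of the general technique, not necessarily what the product should do) is to protect both the count and the pointer with a single lock, so that no caller leaves BeginUpdate until m_pUpdater is fully constructed:

```cpp
#include <mutex>

struct Updater { };

// Sketch: the client count and the pointer share one mutex, so a
// second caller of BeginUpdate blocks until the first caller has
// finished constructing the Updater.
class Foo
{
public:
    void BeginUpdate()
    {
        std::lock_guard<std::mutex> guard(m_lock);
        if (++m_cUpdateClients == 1) {
            m_pUpdater = new Updater(); // error checking still elided
        }
    }

    void EndUpdate()
    {
        std::lock_guard<std::mutex> guard(m_lock);
        if (--m_cUpdateClients == 0) {
            delete m_pUpdater;
            m_pUpdater = nullptr;
        }
    }

    Updater* updater() const { return m_pUpdater; } // for illustration only

private:
    std::mutex m_lock;
    long m_cUpdateClients = 0;
    Updater* m_pUpdater = nullptr;
};
```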

    What we saw in the crash dump was that UpdateSomething tried to use m_pUpdater and crashed on a null pointer. What made the crash dump strange was that if you actually looked at the Foo object in memory, the m_pUpdater was non-null!

        mov ecx, [esi+8] // load m_pUpdater
        mov eax, [ecx]   // load vtable -- crash here
    

    If you actually looked at the memory pointed-to by ESI+8, the value there was not null, yet in the register dump, ECX was zero.

    Was the CPU hallucinating? The value in memory is nonzero. The CPU loaded a value from memory. But the value it read was zero.

    The CPU wasn't hallucinating. The value it read from memory was in fact zero. The reason why you saw the nonzero value in memory was that in the time it took the null pointer exception to be raised, then caught by the debugger, the other thread managed to finish calling new Updater(), store the result back into memory, and then return back to its caller and proceed as if everything were just fine. Thus, when the debugger went to capture the memory dump, it captured a non-zero value in the dump, and the code which updated m_pUpdater was long gone.

    This type of race condition is more likely to manifest on multi-core machines, because on those types of machines, the two CPUs can have different views of memory. The thread doing the initialization can update m_pUpdater in memory, and other CPUs may not find out about it until some time later. The updated value was still in flight when the crash occurred. Before the debugger can get around to capturing the m_pUpdater member in the crash dump, the in-flight value lands, and what you see in the crash dump does not match what the crashing CPU saw.

  • The Old New Thing

    Security vulnerability reports as a way to establish your l33t kr3|)z

    • 29 Comments

    There is an entire subculture of l33t h4x0rs who occasionally pop into our world, and as such have to adapt their communication style to match their audience. Sometimes the adaptation is incomplete.

    I have appended a file exploit.pl which exploits a vulnerability
    in XYZ version N.M.  The result is a denial of service.
    The perl script generates a file, which if double-clicked,
    results in a crash in XYZ.
    
    S00PrA\/\/e$Um

    #!/usr/bin/perl
    
    system('cls');
    system('color c');
    system('title XYZ DOS Exploit');
    print('
    ----------------------------------------------------
    ****************************************************
    *              __                      $           *
    *   --        |  |     __             $$$          *
    *  |     - -  |__|    |  |           $     | |     *
    *   --  | | | |       |__| \  /\  /   $$$  | |     *
    *     |  - -  |   r   |  |  \/  \/ e     $  -  m   *
    *   --                |  |            $$$          *
    *                                      $           *
    ****************************************************
    ----------------------------------------------------
    ');
    
    sleep 2;
    system('cls');
    print('
    ----------------------------------------------------
    ****************************************************
    *                                      $           *
    *   --                |  |            $$$          *
    *     |  - -  |   L   |__|  /\  /\ 6     $  -  w   *
    *   --  | | | |__     |  | /  \/  \   $$$  | |     *
    *  |     - -  |  |    |__|           $     | |     *
    *   --        |__|                    $$$          *
    *                                      $           *
    ****************************************************
    ----------------------------------------------------
    
    The exploit!
    ');
    sleep 2;
    
    $theexploit = "\0";
    
    open(file, ">exploit.xyz");
    print(file $theexploit);
    
    system('cls');
    print('
    ----------------------------------------------------
    ****************************************************
    *              __                      $           *
    *   --        |  |     __             $$$          *
    *  |     - -  |__|    |  |           $     | |     *
    *   --  | | | |       |__| \  /\  /   $$$  | |     *
    *     |  - -  |   r   |  |  \/  \/ e     $  -  m   *
    *   --                |  |            $$$          *
    *                                      $           *
    ****************************************************
    ----------------------------------------------------
    
    DONE!
    
    Double-click exploit.xyz in XYZ and KABLOOEEYYY!
    ');
    
    sleep 3;
    
    system('cls');
    print('
    ----------------------------------------------------
    ****************************************************
    *              __                      $           *
    *   --        |  |     __             $$$          *
    *  |     - -  |__|    |  |           $     | |     *
    *   --  | | | |       |__| \  /\  /   $$$  | |     *
    *     |  - -  |   r   |  |  \/  \/ e     $  -  m   *
    *   --                |  |            $$$          *
    *                                      $           *
    ****************************************************
    ----------------------------------------------------
    
    CONSTRUCTED BY S00PrA\/\/e$Um
    
    Special thanks to: XploYtr & T3rM!NaT3R.
    ');
    

    You may have trouble finding the exploit buried in that perl script, because the perl script consists almost entirely of graffiti and posturing and chest-thumping. (You may also have noticed a bug.) Here is the script with all the fluff removed:

    $theexploit = "\0";
    
    open(file, ">exploit.xyz");
    print(file $theexploit);
    

    This could've been conveyed in a simple sentence: "Create a one-byte file consisting of a single null byte." But if you did that, then you wouldn't get your chance to put your name up in lights on the screen of a Microsoft security researcher!

    (For the record, the issue being reported was not only known, a patch for it had already been issued at the time the report came in. The crash is simply a self-inflicted denial of service with no security consequences. There isn't even any data loss because XYZ can open only one file at a time, so by the time it crashes, all your previous work must already have been saved.)

  • The Old New Thing

    When studying performance, you need to watch out not only for performance degradation, but also unexpected performance improvement

    • 5 Comments

    In addition to specific performance tests run by individual feature teams, Windows has a suite of automated performance tests operated by the performance team, and the results are collated across a lot of metrics. When a number is out of the ordinary, the test results are flagged for further investigation.

    The obvious thing that the performance metrics look for are sudden drops in performance. If an operation that used to consume 500KB of memory now consumes 750KB of memory, then you need to investigate why you're using so much memory all of a sudden. The reasons for the increase might be obvious, like "Oh, rats, there's a memory leak." Or they might be indirect, like "We changed our caching algorithm." Or "In order to address condition X, we added a call to function Y, but it turns out that function Y allocates a lot of memory." Or it could be something really hard to dig up. "We changed the timing of the scenario, so a window gets shown before it is populated with data, resulting in two render passes instead of one; the first pass is of an empty window, and the second is when the data is present." Chasing down these elusive performance regressions can be quite time consuming, but it's part of the job.

    (End of exposition.)

    The non-obvious thing is that the performance metrics also look for sudden improvements in performance. Maybe the memory usage plummeted, or the throughput doubled. Generally speaking, a sudden improvement in performance has one of two sources.

    The first is the one you like: The explained improvement. Maybe the memory usage went down because you found a bug in your cache management policy where it hung onto stale data too long. Maybe the throughput improved because you found an optimization that let you avoid some expensive computations in a common case. Maybe you found and fixed the timing issue that was resulting in wasted render passes. (And then there are two types of explained improvements: the expected explained improvement, where something improved because you specifically targeted the improvement, and the unexpected explained improvement, where something improved as an understood side-effect of some other work.)

    The second is the one that you don't like: The unexplained improvement. The memory usage activity went down a lot, but you don't remember making any changes that affect your program's memory usage profile. Things got better but you don't know why.

    The danger here is that the performance gain may be the result of a bug. Maybe the scenario completed with half the I/O activity because the storage system is ignoring flush requests. Or it completed 15% faster because the cache is returning false cache hits.

    At the end of the day, when you finally understand what happened, you can then make an informed decision as to what to do about it. Maybe you can declare it an acceptable degradation and revise the performance baseline. ("Yes, we use more memory to render, but that's because we're using a higher-quality effects engine, and we consider the additional memory usage to be an acceptable trade-off for higher quality output.") Maybe you will look for an alternate algorithm that is less demanding on memory usage, or bypass calling function Y if it doesn't appear that condition X is in effect. Maybe you can offset the performance degradation by improving other parts of the component. Maybe the sudden performance improvement is a bug, or maybe it's an expected gain due to optimizations.

    But until you know why your performance profile changed, you won't know whether the change was good or bad.

    After all, if you don't know why your performance improved, how do you know that it won't degrade just as mysteriously? Today, you're celebrating that your memory usage dropped from 200MB to 180MB. Two weeks from now, when the mysterious condition reverts itself, you'll be trying to figure out why your memory usage jumped from 180MB to 200MB.

  • The Old New Thing

    Various ways of performing an operation asynchronously after a delay

    • 23 Comments

    Okay, if you have a UI thread that pumps messages, then the easiest way to perform an operation after a delay is to set a timer. But let's say you don't have a UI thread that you can count on.

    One method is to burn a thread:

    #define ACTIONDELAY (30 * 60 * 1000) // 30 minutes, say
    
    DWORD CALLBACK ActionAfterDelayProc(void *)
    {
     Sleep(ACTIONDELAY);
     Action();
     return 0;
    }
    
    BOOL PerformActionAfterDelay()
    {
     DWORD dwThreadId;
     HANDLE hThread = CreateThread(NULL, 0, ActionAfterDelayProc,
                                   NULL, 0, &dwThreadId);
     BOOL fSuccess = hThread != NULL;
     if (hThread) {
      CloseHandle(hThread);
     }
     return fSuccess;
    }
    

    Less expensive is to borrow a thread from the thread pool:

    BOOL PerformActionAfterDelay()
    {
     return QueueUserWorkItem(ActionAfterDelayProc, NULL,
                              WT_EXECUTELONGFUNCTION);
    }
    

    But both of these methods hold a thread hostage for the duration of the delay. Better would be to consume a thread only when the action is in progress. For that, you can use a thread pool timer:

    void CALLBACK ActionAfterDelayProc(void *lpParameter, BOOLEAN)
    {
     HANDLE *phTimer = static_cast<HANDLE *>(lpParameter);
     Action();
     DeleteTimerQueueTimer(NULL, *phTimer, NULL);
     delete phTimer;
    }
    
    BOOL PerformActionAfterDelay()
    {
     BOOL fSuccess = FALSE;
     HANDLE *phTimer = new(std::nothrow) HANDLE;
     if (phTimer != NULL) {
      if (CreateTimerQueueTimer(
         phTimer, NULL, ActionAfterDelayProc, phTimer,
         ACTIONDELAY, 0, WT_EXECUTEONLYONCE)) {
       fSuccess = TRUE;
      }
     }
     if (!fSuccess) {
      delete phTimer;
     }
     return fSuccess;
    }
    

    The timer queue timer technique is complicated by the fact that we want the timer to self-cancel, so it needs to know its handle, but we don't know the handle until after we've scheduled it, at which point it's too late to pass the handle as a parameter. In other words, we'd ideally like to create the timer, and then once we get the handle, go back in time and pass the handle as the parameter to Create­Timer­Queue­Timer. Since the Microsoft Research people haven't yet perfected their time machine, we solve this problem by passing the handle by address: The Create­Timer­Queue­Timer function fills the address with the timer, so that the callback function can read it back out.

    In practice, this additional work is no additional work at all, because you're already passing some data to the callback function, probably an object or at least a pointer to a structure. You can stash the timer handle inside that object. In our case, our object is just the handle itself. If you prefer to be more explicit:

    struct ACTIONINFO
    {
     HANDLE hTimer;
    };
    
    void CALLBACK ActionAfterDelayProc(void *lpParameter, BOOLEAN)
    {
     ACTIONINFO *pinfo = static_cast<ACTIONINFO *>(lpParameter);
     Action();
     DeleteTimerQueueTimer(NULL, pinfo->hTimer, NULL);
     delete pinfo;
    }
    
    BOOL PerformActionAfterDelay()
    {
     BOOL fSuccess = FALSE;
     ACTIONINFO *pinfo = new(std::nothrow) ACTIONINFO;
     if (pinfo != NULL) {
      if (CreateTimerQueueTimer(
         &pinfo->hTimer, NULL, ActionAfterDelayProc, pinfo,
         ACTIONDELAY, 0, WT_EXECUTEONLYONCE)) {
       fSuccess = TRUE;
      }
     }
     if (!fSuccess) {
      delete pinfo;
     }
     return fSuccess;
    }
    

    The threadpool functions were redesigned in Windows Vista to allow for greater reliability and predictability. For example, the operations of creating a timer and setting it into action are separated so that you can preallocate your timer objects (inactive) at a convenient time. Setting the timer itself cannot fail (assuming valid parameters). This makes it easier to handle error conditions since all the errors happen when you preallocate the timers, and you can deal with the problem up front, rather than proceeding ahead for a while and then realizing, "Oops, I wanted to set that timer but I couldn't. Now how do I report the error and unwind all the work that I've done so far?" (There are other new features, like cleanup groups that let you clean up multiple objects with a single call, and being able to associate an execution environment with a library, so that the DLL is not unloaded while it still has active thread pool objects.)

    The result is, however, a bit more typing, since there are now two steps, creating and setting. On the other hand, the new threadpool callback is explicitly passed the PTP_TIMER, so we don't have to play any weird time-travel games to get the handle to the callback, like we did with Create­Timer­Queue­Timer.

    void CALLBACK ActionAfterDelayProc(
        PTP_CALLBACK_INSTANCE, PVOID, PTP_TIMER Timer)
    {
     Action();
     CloseThreadpoolTimer(Timer);
    }
    
    BOOL PerformActionAfterDelay()
    {
     BOOL fSuccess = FALSE;
     PTP_TIMER Timer = CreateThreadpoolTimer(
                          ActionAfterDelayProc, NULL, NULL);
     if (Timer) {
      LONGLONG llDelay = -ACTIONDELAY * 10000LL;
      FILETIME ftDueTime = { (DWORD)llDelay, (DWORD)(llDelay >> 32) };
      SetThreadpoolTimer(Timer, &ftDueTime, 0, 0); // never fails!
      fSuccess = TRUE;
     }
     return fSuccess;
    }
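    The due-time computation deserves a remark: SetThreadpoolTimer takes a FILETIME, and a negative value, measured in 100-nanosecond units, means "relative to now." Here is a portable sketch of the millisecond conversion used above:

```cpp
#include <cstdint>

// Convert a millisecond delay into the relative-FILETIME encoding:
// negate, scale to 100ns units, then split the signed 64-bit value
// into the FILETIME's low/high 32-bit halves.
void MillisecondsToRelativeFiletime(uint32_t milliseconds,
                                    uint32_t* lowDateTime,
                                    uint32_t* highDateTime)
{
    int64_t delay = -static_cast<int64_t>(milliseconds) * 10000LL;
    *lowDateTime  = static_cast<uint32_t>(delay);
    *highDateTime = static_cast<uint32_t>(delay >> 32);
}
```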
    

    Anyway, that's a bit of a whirlwind tour of some of the ways of arranging for code to run after a delay.

  • The Old New Thing

    If you're asking somebody to help you, you want to make it easy for them, not harder

    • 36 Comments

    A customer liaison asked a question that went roughly like this:

    From: Liaison

    My customer is having a problem with the header file winfoo.h. The environment is Windows Vista German SP1 with Visual Studio 2008 Professional German SP1 and Windows SDK 6000.0.6000 German. When they try to include the file in their project, they get an error on line 42. "The character ';' was not expected here." Can somebody help?

    From: Raymond

    Can somebody please attach a copy of the offending file? There are many versions of winfoo.h, and it's not clear which version comes with Visual Studio 2008 Professional German SP1 + Windows SDK 6000.0.6000 German.

    I figured that'd be easier than provisioning a new virtual machine, obtaining temporary license keys for all the products, then installing Windows Vista German SP1, Visual Studio 2008 Professional German SP1, and Windows SDK 6000.0.6000 German. All to get one file. A file the customer already has right in front of them. Where all that's really interesting is line 42, and maybe a few lines surrounding it on either side.

    Time passes.

    The following day, another message arrives from the customer liaison.

    From: Liaison

    Anyone?

    At this point, I engaged my thermonuclear social skills.

    From: Raymond

    By "somebody", I meant "you".

    That went into some people's "Raymond Quotes" file.

    Remember, if you're asking somebody to help you, you want to make it easy for them, not harder.

    "Hey, Bob, I've got a jar of pickles, and I can't open the lid. Can you help?"

    Bob says, "Sure, I'll give it a try."

    "Great! The jar is in the storage room in the basement, on the third aisle, top shelf, second from the left."

    Bob is now suddenly less interested in helping you open that jar of pickles. Shouldn't you go get the jar of pickles and bring it to him?

  • The Old New Thing

    How does the window manager decide where to place a newly-created window?

    • 16 Comments

    Amit wonders how Windows chooses where to place a newly-opened window on a multiple-monitor system and gives as an example an application whose monitor choice appears inconsistent.

    The easy part is if the application specifies where it wants the window to be. In that case, the window is placed at the requested location. How the application chooses those coordinates is up to the application.

    On the other hand, if the application passes CW_USE­DEFAULT, this means that the application is saying, "I have no opinion where the window should go. Please pick a place for me."

    If this is the first top-level window created by the application with CW_USEDEFAULT as its position, and the STARTF_USEPOSITION flag is set in the STARTUPINFO structure, then the window manager uses the position provided in the dwX and dwY members.

    Officially, that's all you're going to see in the documentation. Past this point is all implementation detail. I'm providing it here to satisfy your curiosity, but please don't write code that relies on it. (This is, I realize, a meaningless request, but I must go through the motions of making it anyway.)

    Okay, now let's dive into the various levels of automatic window positioning the window manager performs. Remember, these algorithms are not contractual and can change at any time. (In fact, they have changed in the past.) Just to make it harder to rely on this algorithm, I will not tell you which operating system implements the algorithm described below.

    From now on, assume that the application has specified CW_USE­DEFAULT as its position. Also assume that the window is a top-level window.

    First we have to choose a monitor.

    • If the window was created with an owner, then the window goes onto the monitor associated with the owner window. This tends to keep related windows together on the same monitor.
    • Else, if the process was created by the Shell­Execute­Ex function, and the SEE_MASK_HMONITOR flag was passed in the SHELL­EXECUTE­INFO structure, then the window goes onto the specified monitor.
    • Else, the window goes on the primary monitor.

    Next, we have to choose a location on that monitor.

    • If this is the first time we need to choose a default location on a monitor, or if the previous default location is too close to the bottom right corner of the monitor, then act as if the previous default location for the monitor was the upper left corner of the monitor.
    • The next default location on a monitor is offset from the previous default location, diagonally down and to the right.
      • The vertical offset is chosen so that the top edge of the new window lines up against the bottom of the previous window's caption.
      • The horizontal offset is chosen so that the left edge of the new window lines up against the right edge of the caption icon of the previous window.

    The effect of this algorithm is that if you open a bunch of default-positioned windows on a monitor, they line up in a pretty cascade marching down and to the right, until the cascade goes too far, and then they return to the upper left and resume cascading.

    Finally, after choosing a monitor and a location on the monitor, the selected location is adjusted (if possible) so that the window does not span monitors.
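    The cascade described above can be sketched in code. This is a minimal model for illustration only, not the actual window-manager implementation; all of the names, metrics (caption height, icon offset, monitor size), and the per-monitor state structure are invented, and the real algorithm derives its offsets from system metrics:

    ```cpp
    // Hypothetical sketch of the default-position cascade described above.
    struct Point { int x; int y; };

    struct CascadeState {
        Point next{0, 0};   // where the next default-positioned window goes
        int monitorWidth;
        int monitorHeight;
    };

    Point ChooseDefaultPosition(CascadeState& s,
                                int captionHeight, // vertical step per window
                                int iconOffset,    // horizontal step per window
                                int windowWidth, int windowHeight)
    {
        // If the previous default location has marched too close to the
        // bottom right corner, restart the cascade at the upper left.
        if (s.next.x + windowWidth > s.monitorWidth ||
            s.next.y + windowHeight > s.monitorHeight) {
            s.next = {0, 0};
        }
        Point chosen = s.next;
        // Advance diagonally down and to the right for the next window.
        s.next = {chosen.x + iconOffset, chosen.y + captionHeight};
        return chosen;
    }
    ```

    Each call hands back the previous position offset down and to the right, and the cascade wraps back to the upper left once a window would no longer fit.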

    And that's it, the default-window-positioning algorithm, as it existed in an unspecified version of Windows. Remember, this algorithm has been tweaked in the past, and it will get tweaked more in the future, so don't rely on it.

  • The Old New Thing

    Why are there both FIND and FINDSTR programs, with unrelated feature sets?

    • 35 Comments
    Jonathan wonders why we have both find and findstr, and furthermore, why the two programs have unrelated features. The find program supports UTF-16, which findstr doesn't; on the other hand, the findstr program supports regular expressions, which find does not.

    The reason why their feature sets are unrelated is that the two programs are unrelated.

    The find program came first. As I noted in the article, the find program dates back to 1982. When it was ported to Windows NT, Unicode support was added. But nobody bothered to add any features to it. It was intended to be a straight port of the old MS-DOS program.

    Meanwhile, one of my colleagues over on the MS-DOS team missed having a grep program, so he wrote his own. Developers often write these little tools to make their lives easier. This was purely a side project, not an official part of any version of MS-DOS or Windows. When he moved to the Windows 95 team, he brought his little box of tools with him, and he ported some of them to Win32 in his spare time because, well, that's what programmers do. (This was back in the days when programmers loved to program anything in their spare time.)

    And that's where things stood for a long time. The official find program just searched for fixed strings, but could do so in Unicode. Meanwhile, my colleague's little side project supported regular expressions but not Unicode.

    And then one day, the Windows 2000 Resource Kit team said, "Hey, that's a pretty cool program you've got there. Mind if we include it in the Resource Kit?"

    "Sure, why not," my colleague replied. "It's useful to me, maybe it'll be useful to somebody else."

    So in it went, under the name qgrep.

    Next, the Windows Resource Kit folks said, "You know, it's kind of annoying that you have to go install the Resource Kit just to get these useful tools. Wouldn't it be great if we put the most useful ones in the core Windows product?" I don't know what sort of cajoling was necessary, but they convinced the Windows team to add a handful of Resource Kit programs to Windows. Along the way, qgrep somehow changed its name to findstr. (Other Resource Kit programs kept their names, like where and diskraid.)

    So there you have it. You can think of the find and findstr programs as examples of parallel evolution.

  • The Old New Thing

    If you're going to write your own allocator, you need to respect the MEMORY_ALLOCATION_ALIGNMENT

    • 22 Comments

    This time, I'm not going to set up a story. I'm just going to go straight to the punch line.

    A customer overrode the new operator in order to add additional instrumentation. Something like this:

    struct EXTRASTUFF
    {
        DWORD Awesome1;
        DWORD Awesome2;
    };
    
    // error checking elided for expository purposes
    void *operator new(size_t n)
    {
      EXTRASTUFF *extra = (EXTRASTUFF *)malloc(sizeof(EXTRASTUFF) + n);
      extra->Awesome1 = get_awesome_1();
      extra->Awesome2 = get_awesome_2();
      return ((BYTE *)extra) + sizeof(EXTRASTUFF);
    }
    
    // use your imagination to implement
    // operators new[], delete, and delete[]
    

    This worked out okay on 32-bit systems because in 32-bit Windows, MEMORY_ALLOCATION_ALIGNMENT is 8, and sizeof(EXTRASTUFF) is also 8. If you start with a value that is a multiple of 8, then add 8 to it, the result is still a multiple of 8, so the pointer returned by the custom operator new remains properly aligned.

    But on 64-bit systems, things went awry. On 64-bit systems, MEMORY_ALLOCATION_ALIGNMENT is 16, but sizeof(EXTRASTUFF) is still 8. Starting from a 16-byte-aligned pointer and adding 8 leaves you 8 bytes off from 16-byte alignment. As a result, the custom operator new handed out guaranteed-misaligned memory.

    The misalignment went undetected for a long time, but the sleeping bug finally woke up when somebody allocated a structure that contained an SLIST_ENTRY. As we saw earlier, the SLIST_ENTRY really does need to be aligned according to the MEMORY_ALLOCATION_ALIGNMENT, especially on 64-bit systems, because 64-bit Windows takes advantage of the extra "guaranteed to be zero" bits that 16-byte alignment gives you. If your SLIST_ENTRY is not 16-byte aligned, then those "guaranteed to be zero" bits are not actually zero, and then the algorithm breaks down.

    Result: Memory corruption and eventually a crash.
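    One way out (a sketch under assumptions, not the customer's actual fix) is to round the instrumentation header up to the allocation alignment instead of assuming sizeof(EXTRASTUFF) is good enough. The function names are invented, std::uint32_t stands in for DWORD, and the hard-coded constant stands in for MEMORY_ALLOCATION_ALIGNMENT from <winnt.h>:

    ```cpp
    #include <cstdint>
    #include <cstdlib>

    constexpr std::size_t kAlignment = 16; // MEMORY_ALLOCATION_ALIGNMENT on 64-bit

    struct EXTRASTUFF
    {
        std::uint32_t Awesome1; // stand-ins for DWORD
        std::uint32_t Awesome2;
    };

    // Round the header size up to a multiple of the alignment,
    // so the pointer handed back to the caller stays aligned.
    constexpr std::size_t kHeaderSize =
        (sizeof(EXTRASTUFF) + kAlignment - 1) & ~(kAlignment - 1);

    // error checking elided for expository purposes
    void *InstrumentedAlloc(std::size_t n)
    {
        EXTRASTUFF *extra =
            static_cast<EXTRASTUFF *>(std::malloc(kHeaderSize + n));
        extra->Awesome1 = 1; // get_awesome_1() in the original
        extra->Awesome2 = 2; // get_awesome_2() in the original
        return reinterpret_cast<unsigned char *>(extra) + kHeaderSize;
    }

    void InstrumentedFree(void *p)
    {
        std::free(reinterpret_cast<unsigned char *>(p) - kHeaderSize);
    }
    ```

    Because kHeaderSize is now a multiple of 16 rather than 8, adding it to an aligned allocation preserves the alignment, and structures containing an SLIST_ENTRY land where they're supposed to.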

  • The Old New Thing

    It rather involved being on the other side of this airtight hatchway: Silently enabling features

    • 41 Comments

    A security vulnerability report arrived which went roughly like this:

    When you programmatically enable the XYZ feature, the user receives no visual alert that it is enabled. As a result, malware can enable this feature and use it as part of an attempt to turn the machine into a botnet zombie. The XYZ feature should notify the user when it is enabled, so that the presence of malware is more easily determined.

    Okay, first of all, before we get to the security part of this issue, let's look at the user interface design. The proposed change is that, when the XYZ feature is enabled programmatically, the user receive a notification "XYZ is now enabled."

    You know what most users are going to do when they get that notification?

    Ignore it.

    There are two cases where XYZ can be programmatically enabled. The user may have enabled it themselves by, say, checking a checkbox, and the code that handles the checkbox turns around and programmatically enables the XYZ feature. In this case, the notification is an annoyance like my three-year-old niece who narrates every single thing she does. The user goes to the XYZ control panel, enables XYZ, and in response to the XYZ control panel enabling XYZ, the user gets a notification balloon that says "XYZ is now enabled."

    Well DUH.

    The other case is that the user did not enable it themselves, in which case the balloon is an annoyance because it says something that the user doesn't care about and probably doesn't even understand.

    "The tech tech tech is now tech tech tech."

    Displaying a notification doesn't really help. Either the user expects it, in which case it's an annoyance, or the user doesn't expect it, in which case they most likely won't understand it either, so it's still just an annoyance. (And taking no action leaves the feature enabled.)

    Okay, now let's look at the security aspect of this report. Enabling the XYZ feature requires administrator privileges, so any malware which successfully turns on the XYZ feature has already pwned your machine. It's already on the other side of the airtight hatchway.

    Displaying a warning when your machine is pwned doesn't accomplish anything: Since the malware already has complete control of the machine, it can patch out the code that displays the notification balloon. In other words, the only case in which the user actually sees the XYZ notification is when the user was expecting it to be turned on anyway, at which point you're just being a chatty Cathy.

    Exercise: "You can get rid of the notification in the case where the user enabled the feature manually by adding a fSuppressWarnings parameter to the Enable­XYZ function, and having the code that handles the checkbox pass fSuppressWarnings = TRUE. That leaves only the second case, which is exactly the case where we want the user to be annoyed." Discuss.
