• The Old New Thing

    Debugging walkthrough: Access violation on nonsense instruction, episode 2

    • 17 Comments

    A colleague of mine asked for help debugging a strange failure. Execution halted on what appeared to be a nonsense instruction.

    eax=0079f850 ebx=00000000 ecx=00000113 edx=00000030 esi=33ee06ef edi=74b9b8ad
    eip=00c0ac74 esp=0079f82c ebp=0079f86c iopl=0         nv up ei pl zr na pe nc
    cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00000246
    00c0ac74 0000            add     byte ptr [eax],al          ds:002b:0079f850=74
    

    If you've been debugging x86 code for a while, you immediately recognize this instruction as "executing a page of zeroes". If you haven't been debugging x86 code for a while, you can see this from the code bytes in the second column.

    So how did we end up at this nonsense instruction?

    The instruction is not near a page boundary, so we didn't fall through to it. We must have jumped to it or returned to it.

    Since debugging is an exercise in optimism, let's assume that we jumped to it via a call instruction, and the return address is still on the stack.

    0:000> dps esp l2
    0079f82c  74b9b8b1 user32!GetMessageW+0x4
    0079f830  008f108b CONTOSO!MessageLoop+0xe7
    0:000> u user32!GetMessageW l3
    USER32!GetMessageW:
    74b9b8ad cc              int     3
    74b9b8ae ff558b          call    dword ptr [ebp-75h]
    74b9b8b1 ec              in      al,dx
    

    Well, that explains it. The code bytes for the Get­MessageW function were overwritten, causing us to execute garbage, and one of the garbage instructions was a call that took us to page of zeroes.

    But look more closely at the overwritten bytes.

    The first byte is cc, which is a breakpoint instruction. Hm...

    Since Windows functions begin with a MOV EDI, EDI instruction for hot patching purposes, the first two bytes are always 8b ff. If we unpatch the cc to 8b, we see that the rest of the code bytes are intact.

    USER32!GetMessageW:
    74b9b8ad 8bff            mov     edi,edi
    74b9b8af 55              push    ebp
    74b9b8b0 8bec            mov     ebp,esp
    

    After a brief discussion, we were able to piece together what happened:

    Somebody was trying to debug the CONTOSO application, so they connected a user-mode debugger to the application. Meanwhile, they set a breakpoint on user32!GetMessageW from the kernel debugger. Setting a breakpoint in a debugger is typically performed by patching an int 3 at the point where you want the breakpoint. When the int 3 fires, the debugger regains control and says, "Oh, thanks for stopping. Let me unpatch all the int 3's I put in the program to put things back the way they were."

    When the breakpoint hit, it was caught by the user-mode debugger, but since the user-mode debugger didn't set that breakpoint, it interpreted the int 3 as a hard-coded breakpoint in the application. At this point, the developer saw a spurious breakpoint, didn't know what it meant, and simply resumed execution. This executed the second half of the MOV EDI, EDI instruction as the start of a new instruction, and havoc ensued.

    That developer then asked his friend what happened, and his friend asked me.

    TL;DR: Be careful if you have more than one debugger active. Breakpoints set by one debugger will not be recognized by the other. If the breakpoint instruction is caught by the wrong debugger, things will go downhill fast unless you take corrective action. (In this case, it would be restoring the original byte.)

  • The Old New Thing

    Well at least nobody's parking there any more

    • 31 Comments

    There is a field next to the Microsoft building I used to work in, and for a time, people parked their cars out on the field, presumably because finding a proper parking space in the garage became difficult due to overcrowding. To prevent people from parking in the field, Security placed a large log across the access to the field. The technique worked: Nobody parked in the field any more.

    Some months later, our building had a fire drill, and everybody dutifully filed out of the building and waited until the all-clear signal was given to return. Normally, people would wait in the field, because that is the designated assembly area for the building, but oh wait, Security blocked off access to the field. Instead, people waited in the fire lane.

    Good thing this was just a drill, because they would have gotten run over by a fire truck.

    I pointed out to the Life Safety Security representative who was running the fire drill that Parking Security had created a life safety hazard. My observations were duly noted, and it looks like somebody actually paid attention, because a few weeks later, the log was removed. Now if there's a real fire, we can actually reach our designated assembly area.

    I just found it ironic that the Security department created a safety hazard.

  • The Old New Thing

    The publicity machine doesn't stop: TechNet podcast interview

    • 14 Comments

    The TechNet Magazine Podcast page has just posted their February 2007 entry, which includes an interview with little old me in the second half. I haven't listened to the whole interview yet, but what struck me immediately is that I was pretty darned punchy and goofy, whereas I think the host was trying to take a more serious tone. Oops.

  • The Old New Thing

    What is the strange garbage-looking string in the "command" value of a static verb?

    • 14 Comments

    A customer from a major software vendor asked, "What is the significance of the command value that can be found under HKCR\⟨progid⟩\shell\open\command. It appears to be a copy of the default value, but with the program name replaced with apparent garbage. We've seen this both with Microsoft products as well as products by other companies. There is no mention of this value in the documentation on static verbs."

    Name Type Data
    (Default) REG_SZ "C:\Program Files\Contoso\CONTOSO.exe" /NOLOGO "%1"
    command REG_MULTI_SZ 34GY`{XL?{Y)2S($,PP>c=@0l{Ja0N8KUwy@4JdO /NOLOGO "%1"

    The customer didn't explain why they were interested in this particular registry value. Maybe they thought it was enabling some super magical powers, and they wanted to get in on that action. (If that was the case, then they failed to notice that the same command value also existed in the verb registration for their own program!)

    That strange garbage-looking string was placed there by Windows Installer (also known as MSI). It is the so-called Darwin descriptor that Windows Installer uses to figure out what program to run when the verb is invoked by the shell. For compatibility with programs that read the registry directly (because everybody knows that reading the registry is much cooler than using the API), the default value is set to something approximating the local executable's path. That default value might be incorrect if the application has moved in the meantime, and it might be missing entirely if the application is marked as install-on-demand and has never been used, but at least it keeps those rogue programs working 99% of the time.

  • The Old New Thing

    Misleading advertisement: Passports or green cards?

    • 20 Comments

    I happened to spot an online advertisement for a company that will help you enter the lottery for a United States Permanent Resident Card, commonly known as a Green Card (even though they card isn't green any more). The advertisement was illustrated with a picture of a United States passport.

    Um, a Green Card is not the same as a passport, nor does a Green Card authorize you to obtain a passport. Passports are for citizens, not alien permanent residents.

  • The Old New Thing

    Sometimes sports-rule lawyering comes true: The strikeout with only one thrown pitch

    • 15 Comments

    Some time ago, I engaged in some sports-rule lawyering to try to come up with a way the losing team could manage to salvage a win without any remaining at-bats. It involved invoking a lot of obscure rules, but astonishingly one of the rules that I called upon was actually put into effect a few days ago.

    The Crawfish Boxes provides an entertaining rundown of the sequence of events. Here is the boring version:

    During his plate appearance, Vinnie Catricala was not pleased with the strike call on the first pitch he received. He exchanged words with the umpire, then stepped out of the batter's box to adjust his equipment. He did this without requesting or receiving a time-out. The umpire repeatedly instructed Catricala to take his position in the batter's box, which he refused to do. The umpire then called a strike on Catricala, pursuant to rule 6.02(c). Catricala, failing to comprehend the seriousness of the situation, still did not take his position in the batter's box, upon which the umpire called a third strike, thereby rendering him out.

    You can watch it for yourself.

    (Any discussion of this incident cannot be carried out without somebody referring to this Bugs Bunny cartoon, so there, I've done it so you don't have to.)

    I noted back in 2011 that the conventional way of implementing the automatic strike is for the umpire to direct the pitcher to throw a pitch, and to call it a strike no matter where it lands. This rule was revised in 2008 so that the umpire simply declares a strike without a pitch. This removes the deadlock situation I referred to in my earlier article, where the umpire instructs the pitcher to deliver a pitch, and the pitcher refuses. (The rule change also removes a bunch of wacky edge cases, like, "What if the pitcher throws the pitch as instructed by the umpire, and the batter jumps into the batter's box and hits a home run?")

    The revised rule 6.02(d)(1) specifically enumerates the eight conditions under which a batter is permitted to step out of the batter's box, none of which applied here.

    (Note that the rules of baseball stipulate that unless the umpire has granted Time, batters step out of the batter's box at their own risk. The ball is still live, and a pitch may be delivered.)

    Major League Baseball revised the rule in order to speed up the game, accompanying the rule change with instructions to umpires to enforce rules more vigilantly. Time between pitches is by far the largest chunk of wasted time in a baseball game, totalling well over an hour in a typical game. If you add the time between batters, you end up with over half of the elapsed time spent just waiting for something to start.

  • The Old New Thing

    Email tips from Adam Phillabaum

    • 3 Comments

    Adam Phillabaum of Doing Boeing (who kept the name even though he left Boeing and now works for PayScale) has his own tips for writing email. Recommended reading.

  • The Old New Thing

    Children's reactions to macadamia nuts dipped in chocolate

    • 12 Comments

    Some time ago, we had some people over our house for dinner, and we had a selection of desserts including chocolate-covered macadamia nuts. The children in attendance finished their dinner before the adults (because the adults were busy doing boring things like talking) and were excused to go play.

    It so happens that the play room is right next to the kitchen, and in the kitchen was the dessert table, including a bowl of chocolate-covered macadamia nuts, which the children managed to consume in their entirety before the adults finally finished talking and went to get dessert.

    The second half of the story didn't come until a few days later, when we were taking out the recycling. At the bottom of the recycling bin was a collection of macadamia nuts with all the chocolate sucked off.

    (I was reminded of this story by laonianren's comment on chocolate-covered Brazil nuts.)

  • The Old New Thing

    Enumerating set partitions with Bell numbers and Stirling numbers of the second kind

    • 15 Comments

    Just for fun, today's Little Program is going to generate set partitions. (Why? Because back in 2005, somebody asked about it on an informal mailing list, suggesting it would be an interesting puzzle, and now I finally got around to solving it.)

    The person who asked the question said,

    Below we show the partitions of [4]. The periods separate the individual sets so that, for example, 1.23.4 is the partition {{1},{2,3},{4}}.

    1 blocks: 1234
    2 blocks: 123.4,  124.3,  134.2,  1.234,  12.34,  13.24,  14.23
    3 blocks: 1.2.34,  1.24.3,  1.23.4,  14.2.3,  13.2.4,  12.3.4
    4 blocks: 1.2.3.4

    I replied with a hint, saying, "This page explains what you need to do, once you reinterpret the Stirling recurrence as an enumeration." Only now, writing up this post, did I realize that I linked to the same page they were quoting from.

    The key is in the sentence, "They satisfy the following recurrence relation, which forms the basis of recursive algorithms for generating them." Let's reinterpret the Stirling recurrence combinatorially. No wait, we don't need to. Wikipedia did it for us. Go read that first. If you didn't follow Wikipedia's explanation, here's another angle:

    Suppose you have a set of n elements, and you want to generate all the partitions into k blocks. Well, let's look at element n. What happens when you delete it from the partition?

    One possibility is that n was in a block all by itself. After you remove it, you are left with a partition of n − 1 elements into k − 1 blocks. Therefore, to generate the partitions where n is in a block all by itself, you generate all the partitions of n − 1 elements into k − 1 blocks, and then tack on a single block consisting of just element n.

    On the other hand, if element n was not in a block by itself, then removing it leaves a partition of n − 1 elements into k blocks. (Removing n did not decrease the number of blocks because there are still other numbers keeping that block alive.) Now, element n could have belonged to any of the k blocks, so you have k possible places where you could reinsert element n.

    Therefore, our algorithm goes like this:

    • Handle base cases.
    • Otherwise,
      • Recursively call S(n − 1, k) and add element n separately into each of the k blocks.
      • Recursively call S(n − 1, k − 1) and add a single block consisting of just n.

    Now that we spelled out what we're going to do, actually doing it is a bit anticlimactic.

    function Stirling(n, k, f) {
     if (n == 0 && k == 0) { f([]); return; }
     if (n == 0 || k == 0) { return; }
     Stirling(n-1, k-1, function(s) {
      f(s.concat([[n]])); // append [n] to the array
     });
     Stirling(n-1, k, function(s) {
      for (var i = 0; i < k; i++) {
       f(s.map(function(t, j) { // append n to the i'th subarray
        return t.concat(i == j ? n : []);
       }));
      }
     });
    }
    

    You can test out this function by logging the results to the console.

    function logToConsole(s) {
      console.log(JSON.stringify(s));
    }
    
    Stirling(4, 2, logToConsole);
    

    Let's walk through the function. The first test catches the vacuous base case where you say, "Please show me all the ways of breaking up nothing into zero blocks." The answer is, "There is exactly one way of doing this, which is to break it into zero blocks." Math is cute that way.

    The second test catches the other base cases where you say, "Please show me all the ways of breaking up n elements into zero blocks" (can't be done if n > 0, because you will always have elements left over), or you say, "Please show me all the ways of breaking up 0 elements into k blocks" (which also can't be done if k > 0 because there are no elements to put in the first block).

    After disposing of the base cases, we get to the meat of the recurrence. First, we ask for all the ways of breaking n - 1 elements into k - 1 blocks, and for each of them, we add a single block consisting of just n.

    Next, we ask for all the ways of breaking n - 1 elements into k blocks, and for each of them, we go into a loop adding n to each block. The goofy map creates a deep copy of the array and adds n to the ith block.

    Here's a walkthrough of the goofy map:

    • The concat method creates a new array consisting of the starting array with the concat parameters added at the end. If a parameter is an array, then its elements are added; otherwise the parameter itself is added. For example, [1].concat(2, [3, 4]) returns [1, 2, 3, 4]. The concat method creates a new array, and a common idiom is s.concat() to make a shallow copy of an array.
    • The map method calls the provided callback once for each element of the array s. The return values from all the callbacks are collected to form a new array, which is returned. For example, [1,2,3].map(function (v) { return v * 2; }) returns the new array [2, 4, 6].
    • The map callback is called with the subarray as the first parameter and the index as the second parameter. (There is also a third parameter, which we don't use.)
    • Therefore, if all we want to do was create a deep copy of s, we could write s.map(function (t, j) { return t.concat(); }).
    • But we don't want a perfect deep copy. We want to change the ith element. Therefore, we check the index, and if it is equal to the i, then we append n. Otherwise, we append [] which appends nothing.
    • After building the array (with modifications), we pass it to the callback function f.

    This pattern is common enough that maybe we should pull it into a helper function.

    function copyArrayOfArrays(array, indexToEdit, editor) {
     return array.map(function(e, i) {
      return i === indexToEdit ? editor(e) : e.concat();
     });
    }
    
    function Stirling(n, k, f) {
     if (n == 0 && k == 0) { f([]); return; }
     if (n == 0 || k == 0) { return; }
     Stirling(n-1, k-1, function(s) {
      f(s.concat([[n]])); // append [n] to the array
     });
     Stirling(n-1, k, function(s) {
      for (var i = 0; i < k; i++) {
       f(copyArrayOfArrays(s, i, function(e) { return e.concat(n); }));
      }
     });
    }
    

    The copy­Array­Of­Arrays function abstracts the goofy map: You give it an array of arrays, and optionally an index to edit and the function that does the editing. It copies the array of arrays and calls your editor on the element you want to edit.

    To reduce the number of memory allocations, the recursion could also be done destructively. You're then counting on the callback not modifying the result, since you're going to use it again.

    function Stirling(n, k, f) {
     if (n == 0 && k == 0) { f([]); return; }
     if (n == 0 || k == 0) { return; }
     Stirling(n-1, k, function(s) {
      for (var i = 0; i < k; i++) {
       s[i].push(n);
       f(s);
       s[i].pop();
      }
     });
     Stirling(n-1, k-1, function(s) {
      s.push([n]);
      f(s);
      s.pop();
     });
    }
    

    The original question was about enumerating all partitions (Bell numbers), and that's easy to put together from the Stirling numbers.

    function Bell(n, f) {
     for (var i = 1; i <= n; i++) {
      Stirling(n, i, f);
     }
    }
    
  • The Old New Thing

    Obtaining the parsing name (and pidl) for a random shell object

    • 5 Comments

    The parsing name for a shell item is handy, because it lets you regenerate the item later. Actually, the pidl for the shell item is even better, because that is the official way of saving and restoring objects. It's the pidl that gets saved in a shortcut, and since shortcuts can be copied around from machine to machine, pidls must be transportable and forward compatible. (A shortcut file created on Windows XP needs to keep working on all future versions of Windows.)

    Here's a handy little tool for grabbing the parsing name and pidl for a random shell object. Start with our scratch program, and add in the Simple­Drop­Target class, with the following tweaks:

    public:
     SimpleDropTarget() : m_cRef(1) { /* g_ppr->AddRef(); */ }
     ~SimpleDropTarget() { g_ppr->Release(); }
    
    ...
     // *** IDropTarget ***
     STDMETHODIMP DragEnter(IDataObject *pdto,
        DWORD grfKeyState, POINTL ptl, DWORD *pdwEffect)
     {
      *pdwEffect &= DROPEFFECT_LINK;
      return S_OK;
     }
    
     STDMETHODIMP DragOver(DWORD grfKeyState,
       POINTL ptl, DWORD *pdwEffect)
     {
      *pdwEffect &= DROPEFFECT_LINK;
      return S_OK;
     }
    ...
    };
    

    We are not a COM local server, so we won't worry about managing our process reference. And we will accept anything that has a pidl, so we say that we will accept objects via linking. (The original code accepted by copying, which would have made us reject non-copyable objects.)

    Now we can hook these up to our scratch program.

    BOOL
    OnCreate(HWND hwnd, LPCREATESTRUCT lpcs)
    {
      g_hwndChild = CreateWindow(
          TEXT("edit"), nullptr, ES_MULTILINE |
          WS_CHILD | WS_VISIBLE | WS_TABSTOP,
          0, 0, 0,0, hwnd, (HMENU)1, g_hinst, 0);
      SimpleDropTarget *psdt = new(std::nothrow) SimpleDropTarget();
      if (psdt) {
        RegisterDragDrop(hwnd, psdt);
        psdt->Release();
      }
      return TRUE;
    }
    
    void
    OnDestroy(HWND hwnd)
    {
      RevokeDragDrop(hwnd);
      PostQuitMessage(0);
    }
    
    ...
        // Change CoInitialize and CoUninitialize to Ole
        if (SUCCEEDED(OleInitialize(NULL))) {
    ...
            OleUninitialize();
    

    Finally, we need to say what to do when the drop occurs.

    void AppendText(LPCWSTR psz)
    {
      SendMessageW(g_hwndChild, EM_REPLACESEL, 0, (LPARAM)psz);
    }
    
    void OpenFilesFromDataObject(IDataObject *pdto)
    {
      CComPtr<IShellItemArray> spsia;
      if (SUCCEEDED(SHCreateShellItemArrayFromDataObject(
                                      pdto, IID_PPV_ARGS(&spsia)))) {
        CComPtr<IEnumShellItems> spenum;
        spsia->EnumItems(&spenum);
        if (spenum) {
          for (CComPtr<IShellItem> spsi;
               spenum->Next(1, &spsi, nullptr) == S_OK;
               spsi.Release()) {
            CComHeapPtr<wchar_t> spszName;
            if (SUCCEEDED(spsi->GetDisplayName(
                         SIGDN_DESKTOPABSOLUTEPARSING, &spszName))) {
              AppendText(spszName);
              AppendText(L"\r\n");
            }
            CComHeapPtr<ITEMIDLIST_ABSOLUTE> spidl;
            if (SUCCEEDED(CComQIPtr<IPersistIDList>(spsi)->
                                                GetIDList(&spidl))) {
              UINT cb = ILGetSize(spidl);
              BYTE *pb = reinterpret_cast<BYTE *>
                              (static_cast<PIDLIST_ABSOLUTE>(spidl));
              for (UINT i = 0; i < cb; i++) {
                WCHAR szHex[4];
                StringCchPrintf(szHex, ARRAYSIZE(szHex),
                                L"%02X ", pb[i]);
                AppendText(szHex);
              }
              AppendText(L"\r\n");
            }
          }
        }
      }
    }
    

    When the drop occurs, we convert the data object into a shell item array, enumerate the items, and print the parsing name for the item as well as a hex dump of the pidl associated with the item.

    I guess we need some header files.

    #include <shlobj.h>
    #include <strsafe.h>
    #include <atlbase.h>
    #include <atlalloc.h>
    

    Run this program and drop the Recycle Bin onto it, say.

    ::{645FF040-5081-101B-9F08-00AA002F954E}
    14 00 1F 78 40 F0 5F 64 81 50 1B 10 9F 08 00 AA 00 2F 95 4E 00 00 
    

    This tells you two things. First, that if you want to generate the Recycle Bin from a parsing name, you can use that string that starts with two colons.

    var shell = new ActiveXObject("Shell.Application");
    var recycleBin = shell.Namespace(
          "::{645FF040-5081-101B-9F08-00AA002F954E}");
    var items = recycleBin.Items();
    for (var i = 0; i < items.Count; i++) {
     WScript.StdOut.WriteLine(items.Item(i));
    }
    

    Of course, there is a predefined enumeration for the Recycle Bin, so this was a bit of a waste. You could've just written

    var recycleBin = shell.Namespace(10);
    

    But this technique generalizes to other locations in the shell namespace that do not have a special shorthand value.

    The second thing the program tells you is that if you want to generate the Recycle Bin from a pidl, you can just use that chunk of bytes. Okay, that's not quite so interesting from a scripting point of view, but if you're manipulating pidls, this can be quite handy.

    We'll use this program a little bit in a few weeks, but at this point, it's just a "Little Program" for today.

Page 382 of 450 (4,494 items) «380381382383384»