• The Old New Thing

    Microspeak: move the needle

    • 6 Comments

    The phrase move the needle is part of general business jargon, but it is very popular here at Microsoft. You need to know what it means, and more importantly, you need to be willing to throw it around yourself in order to sound more hip and with-it.

    In general business speak, to move the needle means to generate a reaction, but at Microsoft it has the more general sense of providing a perceptible improvement.

    The metaphor here is that there is some sort of meter, like a speedometer or VU meter. Back in the old days, these meters were analog rather than digital, and they consisted of a calibrated arc with a needle that pointed at the current value.

    To move the needle is to have a noticeable effect, presumably positive.

    Here are some citations.

    It's clear that we will need to make some additional improvements if we are to move the needle on performance for customers with high traffic.
    By investing in this solution we move the needle forward on reducing friction and increasing speed to measurable value.

    That second one wins the buzzword award for today.

  • The Old New Thing

    How do I enumerate remembered connections that are not currently connected?

    • 5 Comments

    Harry Johnston wanted to know how to get a list of remembered (but not currently connected) drive mappings.

    The idea here is to make a tweak to the Little Program. Start with what we had and make these changes:

    int __cdecl main(int, char **)
    {
     HANDLE hEnum;
      WNetOpenEnum(RESOURCE_REMEMBERED, // was RESOURCE_CONNECTED
                  RESOURCETYPE_DISK,
                  0,
                  NULL,
                  &hEnum);
    
     ...
    }
    

    This changes the program from enumerating connected resources to enumerating remembered resources.

    The last step is to skip the remembered resources that are also connected. But this part is not Win32 programming; it's just programming. For each remembered resource, check whether the lpLocalName is non-null and matches an lpLocalName that came out of an enumeration of connected resources.

    So let's do it. We start with the header files:

    #define UNICODE
    #define _UNICODE
    #define STRICT
    #include <windows.h>
    #include <stdio.h> // horrors! Mixing C and C++ I/O!
    #include <string>
    #include <set>
    #include <memory>
    #include <winnetwk.h>
    

    Since we are using classes like std::set which throw exceptions, we need to wrap our resources inside RAII classes. Here's one for network resource enumeration:

    class CNetEnumerator
    {
    public:
     CNetEnumerator() = default;
     ~CNetEnumerator() { if (m_hEnum) WNetCloseEnum(m_hEnum); }
     operator HANDLE() { return m_hEnum; }
     HANDLE* operator&() { return &m_hEnum; }
    private:
     HANDLE m_hEnum = nullptr;
    };
    

    Here is our function to enumerate all network resources. It uses a callback because arghhhhhhhhhhh wishes it were so.

    template<typename Callback>
    void for_each_network_resource(
        DWORD dwScope,
        DWORD dwType,
        DWORD dwUsage,
        LPNETRESOURCE pnrIn,
        Callback callback)
    {
     CNetEnumerator hEnum;
     WNetOpenEnum(dwScope, dwType, dwUsage, pnrIn, &hEnum);
    
      const DWORD elements = 65536 / sizeof(NETRESOURCE);
      static_assert(elements > 1, "Must have room for data");
      std::unique_ptr<NETRESOURCE[]> buffer(new NETRESOURCE[elements]);
    
     DWORD err;
     do {
       DWORD cEntries = INFINITE; // 0xFFFFFFFF: return as many entries as possible
      DWORD cb = elements * sizeof(NETRESOURCE);
      err = WNetEnumResource(hEnum, &cEntries, buffer.get(), &cb);
      if (err == NO_ERROR || err == ERROR_MORE_DATA) {
       for (DWORD i = 0; i < cEntries; i++) {
        callback(&buffer[i]);
       }
      }
     } while (err == ERROR_MORE_DATA);
    }
    

    There is a bit of trickery to get the enumeration buffer into a form that C++ likes. We had previously used LocalAlloc, which is guaranteed to return memory suitably aligned for NETRESOURCE. However, we can't do it for new BYTE[], since that returns only byte-aligned data. We solve this problem by explicitly allocating NETRESOURCE objects, but choosing a number so that the result is close to our desired buffer size.¹

    We need another helper class so we can create a case-insensitive set.

    struct CaseInsensitiveWstring
    {
      bool operator()(const std::wstring& a, const std::wstring& b) const {
       return CompareStringOrdinal(a.c_str(), static_cast<int>(a.length()),
                                   b.c_str(), static_cast<int>(b.length()),
                                   TRUE) == CSTR_LESS_THAN;
      }
     }
    };
    

    Okay, now we can start doing actual work:

    void report(PCWSTR pszLabel, PCWSTR pszValue)
    {
     printf("%ls = %ls\n", pszLabel, pszValue ? pszValue : L"(null)");
    }
    
    int __cdecl wmain(int, wchar_t **)
    {
     std::set<std::wstring, CaseInsensitiveWstring> connected;
    
     // Collect the local resources which are already connected.
     for_each_network_resource(RESOURCE_CONNECTED,
      RESOURCETYPE_DISK, 0, nullptr, [&](LPNETRESOURCE pnr) {
       if (pnr->lpLocalName != nullptr) {
        connected.emplace(pnr->lpLocalName);
       }
      });
    
     // Now look for remembered resources that are not connected.
     for_each_network_resource(RESOURCE_REMEMBERED,
      RESOURCETYPE_DISK, 0, nullptr, [&](LPNETRESOURCE pnr) {
       if (pnr->lpLocalName == nullptr ||
           connected.find(pnr->lpLocalName) == connected.end()) {
        report(L"localName", pnr->lpLocalName);
        report(L"remoteName", pnr->lpRemoteName);
        report(L"provider", pnr->lpProvider);
        printf("\n");
       }
      });
    
     return 0;
    }
    

    Not exciting. Mostly consists of boring typing. But hey, that's what programming is like most of the time.

    ¹ If we were being super-weenies about the buffer size, we could have written

     union EnumBuffer {
      BYTE bytes[65536];
      NETRESOURCE nr;
     };
    
     std::unique_ptr<EnumBuffer> buffer(new EnumBuffer());
     LPNETRESOURCE pnr = &buffer->nr;
     ...
      DWORD cb = sizeof(EnumBuffer);
    
  • The Old New Thing

    Debugging walkthrough: Access violation on nonsense instruction, episode 3

    • 22 Comments

    A colleague of mine asked for help debugging a strange failure. Execution halted on what appeared to be a nonsense instruction.

    eax=022b13a0 ebx=00000000 ecx=02570df4 edx=769f4544 esi=02570dec edi=05579748
    eip=76c49131 esp=05cce038 ebp=05cce07c iopl=0         nv up ei pl nz na po nc
    cs=0023  ss=002b  ds=002b  es=002b  fs=0053  gs=002b             efl=00010202
    KERNELBASE!GetFileAttributesExW+0x2:
    76c49131 ec              in      al,dx
    

    This is clearly an invalid instruction. But observe that the offset is +2, which is normally where the body of the function starts, because the first two bytes of Windows operating system functions are a mov edi, edi instruction. Therefore, the function has been corrupted. Let's look back two bytes to see if that gives any clues.

    0:006> u 76c49131-2
    KERNELBASE!GetFileAttributesExW:
    76c4912f e95aecebf3      jmp     IoLog!Mine_GetFileAttributesExW (6ab07d8e)
    

    Oh look, somebody is doing API patching (which is unsupported to begin with), and they did a bad job: they tried to patch code while a thread was in the middle of executing it, resulting in a garbage instruction.

    This is a bug in IoLog. The great thing about API patching is that when you screw up, it looks like an OS bug. That way, nobody ever files bugs against you!

    (In this case, IoLog is a diagnostic tool which is logging file I/O performed by an application which is being instrumented.)

    My colleague replied, "Thanks. Looks like a missing lock in IoLog. It doesn't surprise me that API patching isn't supported..."

  • The Old New Thing

    The Windows 95 I/O system assumed that if it wrote a byte, then it could read it back

    • 15 Comments

    In Windows 95, compressed data was read off the disk in three steps.

    1. The raw compressed data was read into a temporary buffer.
    2. The compressed data was uncompressed into a second temporary buffer.
    3. The uncompressed data was copied to the application-provided I/O buffer.

    But you could save a step if the I/O buffer was a full cluster:

    1. The raw compressed data was read into a temporary buffer.
    2. The compressed data was uncompressed directly into the application-provided I/O buffer.

    A common characteristic of dictionary-based compression is that a compressed stream can contain a code that says "Generate a copy of bytes X through Y from the existing uncompressed data."

    As a simplified example, suppose the cluster consisted of two copies of the same 512-byte block. The compressed data might say "Take these 512 bytes and copy them to the output. Then take bytes 0 through 511 of the uncompressed output and copy them to the output."
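
    To make the mechanism concrete, here is a minimal C++ sketch (my own illustration with a made-up helper name, not the actual Windows 95 decompression engine) of how a decoder expands such a back-reference. The key point is that it reads from the very output buffer it is writing to:

     #include <stddef.h> // for size_t

     // Copy 'length' bytes starting at 'sourceOffset' in the already-written
     // uncompressed output to the current end of the output. If the application
     // modifies the output buffer while this runs, the copied bytes are garbage.
     void ExpandBackReference(unsigned char* output, size_t* outputSize,
                              size_t sourceOffset, size_t length)
     {
      for (size_t i = 0; i < length; i++) {
       output[(*outputSize)++] = output[sourceOffset + i];
      }
     }

    In the two-copies-of-a-block example above, the second step amounts to ExpandBackReference(buffer, &size, 0, 512), which assumes that bytes 0 through 511 still contain what the decoder wrote there a moment earlier.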

    So far, so good.

    Well, except that if the application wrote to the I/O buffer while the read was in progress, then the read would get corrupted because it would copy the wrong bytes to the second half of the cluster.

    Fortunately, writing to the I/O buffer is forbidden during the read, so any application that pulled this sort of trick was breaking the rules, and if it got corrupted data, well, that's its own fault. (You can construct a similar scenario where writing to the buffer during a write can result in corrupted data being written to disk.)

    Things got even weirder if you passed a memory-mapped device as your I/O buffer. There was a bug that said, "The splash screen for this MS-DOS game is all corrupted if you run it from a compressed volume."

    The reason was that the game issued an I/O directly into the video frame buffer. The EGA and VGA video frame buffers used planar memory and latching. When you read or write a byte in video memory, the resulting behavior is a complicated combination of the byte you wrote, the values in the latches, other configuration settings, and the values already in memory. The details aren't important; the important thing is that video memory does not act like system RAM. Write a byte to video memory, then read it back, and not only will you not get the same value back, but you probably modified video memory in a strange way.

    The game in question loaded its splash screen by issuing I/O directly into video memory, knowing that MS-DOS copies the result into the output buffer byte by byte. It set up the control registers and the latches in such a way that the bytes written into memory went exactly where they should. (It issued four reads into the same buffer, with different control registers each time, so that each read ended up being issued to a different plane.)

    This worked great, unless the disk was compressed.

    The optimization above relied on the property that writing a byte followed by reading the byte produces the byte originally written. But this doesn't work for video memory because of the weird way video memory works. The result was that when the decompression engine tried to read what it thought was the uncompressed data, it was actually asking the video controller to do some strange operations. The result was corrupted decompressed data, and corrupted video data.

    The fix was to force double-buffering in non-device RAM if the I/O buffer was into device-mapped memory.

  • The Old New Thing

    Rules can exist not because there's a problem, but in order to prevent future problems

    • 19 Comments

    I lost the link, but one commenter noted that the ReadFile function documentation says

    Applications must not read from, write to, reallocate, or free the input buffer that a read operation is using until the read operation completes.

    The commenter noted, "What is the point of the rule that disallows reading from or writing to the input buffer while the I/O is in progress? If there is no situation today where this actually causes a problem, then why is the rule there?"

    Not all rules exist to address current problems. They can also exist to prevent future problems.

    In general, you don't want the application messing with an I/O buffer because the memory may have been given to the device, and now the device has to deal with bus contention. And there isn't really much interesting you can do with the buffer before the I/O completes. You can't assume that the I/O will complete the first byte of the buffer first, and the last byte of the buffer last. The I/O request may get split into multiple pieces, and the individual pieces may complete out of order.

    So the rule against accessing the buffer while I/O is in progress is not a significant impediment in practice, because you couldn't reliably obtain any information from the buffer until the I/O completed anyway. And the rule leaves room for future versions of the operating system to take advantage of the fact that the application will not read from or write to the buffer.
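
    As a concrete illustration (my own sketch, not code from the article), here is roughly what that contract looks like with overlapped I/O in native code. ReadWholeBuffer is a made-up helper, and the handle is assumed to have been opened with FILE_FLAG_OVERLAPPED; from the moment the read is issued until the completion is observed, the buffer belongs to the system:

     #include <windows.h>

     // Issue an overlapped read and wait for it to finish. Between ReadFile
     // and GetOverlappedResult, the buffer must not be read from or written to.
     BOOL ReadWholeBuffer(HANDLE hFile, void* buffer, DWORD size, DWORD* bytesRead)
     {
      OVERLAPPED ov = {};
      ov.hEvent = CreateEvent(nullptr, TRUE, FALSE, nullptr);
      if (!ov.hEvent) return FALSE;

      BOOL ok = ReadFile(hFile, buffer, size, nullptr, &ov);
      if (!ok && GetLastError() == ERROR_IO_PENDING) {
       // The read is in flight; hands off the buffer until this call returns.
       ok = GetOverlappedResult(hFile, &ov, bytesRead, TRUE);
      } else if (ok) {
       // The read completed synchronously; just retrieve the byte count.
       ok = GetOverlappedResult(hFile, &ov, bytesRead, FALSE);
      }

      CloseHandle(ov.hEvent);
      return ok; // only now is it safe to look at the buffer
     }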

    Tomorrow, I'll tell a story of a case where accessing the I/O buffer before the I/O completed really did cause problems in Windows 95.

  • The Old New Thing

    Microspeak: DRI, the designated response individual

    • 7 Comments

    Someone sent a message to a peer-to-peer discussion group and remarked, "This is critical. I'm a DRI at the moment and have some issues to fix."

    The term DRI was new to many people on the mailing list (including me), and while others on the mailing list helped to solve the person's problem, I also learned that DRI stands for designated response individual or designated responsible individual, depending on whom you ask. This is the person who is responsible for monitoring and replying to email messages sent to a hot issues mailing list. For online services, it's the person responsible for dealing with live site issues.

    From what I can gather, teams that use this model rotate the job of being the DRI through the members of the team, so that each person on the schedule serves as DRI for a set period of time (typically a day or a week). The DRI may also be responsible for running various tools at specific times. Each team sets its own rules.

    Other teams have come up with their own name for this job. Another term I've seen is Point Dev. On our team, we call it the Developer of the Day.

    Bonus chatter: I bought this hat back in the day when the stitching was done by hand on a specially-designed sewing machine. Nowadays, it's computerized.

  • The Old New Thing

    Insightful graph: The ship date predictor

    • 51 Comments

    The best graphs are the ones that require no explanation. You are just told what the x- and y-axes represent, and the answer just jumps out at you.

    One of the greatest graphs I've seen at Microsoft is this one that a colleague of mine put together as Windows 95 was nearing completion. He took each email message from management that changed the Windows 95 RTM date (also known as the ship date) and plotted it on a chart. The x-axis is the date the statement was made, and the y-axis is the number of days remaining in the project, according to the email. The dotted line is a linear least-squares fit, and the green star is the actual ship date (July 14, 1995).

    [Chart: number of days remaining until RTM (0 to 600) plotted against the date each announcement was made, from April 1992 through October 1995, with the dotted least-squares fit line and the green star marking the actual ship date.]
    What's so amazing about this chart is that the linear approximation predicts the actual ship date with very high accuracy. The slope of the line is 0.43, which means that if you took the predicted "days remaining before we ship" and multiplied it by around 2.3 (1/0.43 ≈ 2.3), you'd be pretty close to the actual ship date.

    In other words, management fairly consistently underestimated the number of days until RTM by a factor of 2.3. (Another way of looking at it is that the development team consistently underreported the number of days to completion to management by a factor of 2.3.)

    Bonus amusement

    Here is a pull quote from each of the announcements, lightly edited.

    Date             Revised RTM      Remark
    February 1992    June 1993        "Ready to RTM 6/93. Otherwise, I'll be applying for a job at McDonalds."
    April 1992       September 1993   "This is a critical release."
    July 1992        March 1994       "The feature set will NOT be expanded to fill the new schedule."
    September 1992   December 1993    "This product must RTM by the end of 1993. If we miss this window of opportunity, then the value of this product goes way down."
    January 1993     March 1994       "I recently learned that Team X was planning around a Q4 94 ship date!" (Team X provided code to Windows 95.)
    March 1993       April 1994       "We need to formulate plans which get us there."
    August 1993      May 1994         "It's really important for the company that we make this date. This must be our last slip."
    December 1993    August 1994      "This is about as late as we can go without incurring big financial problems for the company."
    February 1994    September 1994   "What determines the ship date is the team's commitment to a ship date. We must make our RTM date."
    May 1994         November 1994    "Software and hardware vendors are counting on us."
    August 1994      February 1995    "Completing this milestone by the end of the year is absolutely critical to the product gaining quick success."
    December 1994    May 1995         "People all over are planning their business on when we release. We must make our current date."

    Today marks the 20th anniversary of the public release of Windows 95. Just one more year, and you'll be old enough to buy a drink!¹

    Bonus reading: Start Me Up (again): Brad Chase (who ran the worldwide launch of Windows 95) tells the story of how Start Me Up became the anthem for Windows 95, and addresses the legend that it cost $14 million to license the song. (Spoiler: It was more like $3 million.)

    Bonus chatter: The ticket price for the Windows 95 team reunion party is $47.50. This seems like an odd number, but it makes more sense when you buy two tickets (one for you, and one for your partner).

    ¹ In the United States, the age at which it is legal to purchase alcohol is 21.

  • The Old New Thing

    Handy delegate shortcut hides important details: The hidden delegate

    • 13 Comments

    One of my colleagues was having trouble with a little tool he wrote.

    I installed a low-level keyboard hook following the code in this article, but it crashes randomly. Here's what I know so far:

    • I spawn a new STA thread to register the hook, so that it can run a message pump, which is a requirement for low-level hooks.
    • After setting the hook, the program waits on a ManualResetEvent with WaitOne(). Since this is being called from an STA thread, it will pump messages while waiting, which is what we want.
    • The event is signaled by another part of the program when the hook is no longer needed, at which point the thread unregisters the hook before exiting normally.

    The crash happens inside WaitOne() immediately after keyboard activity occurs. The debugger tells me that it is crashing trying to dispatch a call into a managed stub via the message pump, but that's all I was able to extract.

    I took a look at the article that my colleague referenced and observed that there was a subtlety in the code that was not obvious, and which may have been lost in translation. I shared my observation in the form of a psychic prediction.

    My psychic powers tell me that you did not prevent the delegate from getting GC'd. The next time GC runs, the delegate will get collected, and the next attempt to fire the callback will AV because it's calling into memory that has been freed.

    The sample code from the blog avoids this problem by putting the delegate in a private static, which makes it a GC root, ineligible for collection.

    private static LowLevelKeyboardProc _proc = HookCallback;
    

    This is subtle because the private static is decoupled from SetHook. If you copied SetHook but not the private static, then you inadvertently created a bug, because local variables can get optimized out.

    Either put it in a static, like the sample does, or explicitly extend the delegate's lifetime by calling GC.KeepAlive() after you unhook the hook.

    LowLevelKeyboardProc proc = HookCallback;
    IntPtr hookId = SetHook(proc);
    WaitOne();
    RemoveHook(hookId);
    GC.KeepAlive(proc); // keep the proc alive until this line is reached
    

    My colleague realized that was the problem.

    I'd actually thought of that (mostly). I made my callback method itself a static, thinking that this was enough. What I forgot is that C# wraps that in an instance of the delegate automatically, and it was this hidden delegate that was getting GC'd, not the callback function itself. This explains why I could always inspect the callback method and see that it was alive and well, yet we were still jumping into space when invoking the callback.

    Explicitly calling out the assignment reminded me of the details of delegates. Thanks!

    The classical notation for creating a delegate is

        DelegateType d = new DelegateType(o.Method);
        DelegateType d = new DelegateType(Method); // this.Method
    

    C# version 2.0 added delegate inference which lets you omit the new DelegateType most of the time. The compiler will automatically convert the method name (and optional this object) into a delegate.

        DelegateType d = o.Method;
        DelegateType d = Method; // this.Method
    

    This shorthand is so old, you may not even remember (or realize) that it is a shorthand for a hidden delegate.

    In my colleague's program, the line

        IntPtr hookId = SetHook(HookCallback);
    

    was shorthand for

        LowLevelKeyboardProc temp = HookCallback;
        IntPtr hookId = SetHook(temp);
    

    Once the delegate was made explicit rather than hidden, the issue became clear: Since there was nothing keeping the delegate alive, the delegate disappeared at the next GC, and the unmanaged function pointer disappeared with it.

    And now CLR Week will disappear until next time.

  • The Old New Thing

    I saw a pinvoke signature that passed a UInt64 instead of a FILETIME, what's up with that?

    • 12 Comments

    A customer had a question about a pinvoke signature that used a UInt64 to hold a FILETIME structure.

    [DllImport("kernel32.dll", SetLastError = true)
    static external bool GetProcessTimes(
        IntPtr hProcess,
        out UInt64 creationTime,
        out UInt64 exitTime,
        out UInt64 kernelTime,
        out UInt64 userTime);
    

    Is this legal? The documentation for FILETIME says

    Do not cast a pointer to a FILETIME structure to either a ULARGE_INTEGER* or __int64* value because it can cause alignment faults on 64-bit Windows.

    Are we guilty of this cast in the above code? After all, you can't treat a FILETIME as an __int64.

    There are two types of casts possible in this scenario.

    • Casting from FILETIME* to __int64*.
    • Casting from __int64* to FILETIME*.

    The FILETIME structure requires 4-byte alignment, and the __int64 data type requires 8-byte alignment. Therefore the first cast is unsafe, because you are casting from a pointer with lax alignment requirements to one with stricter requirements. The second cast is safe because you are casting from a pointer with strict alignment requirements to one with laxer requirements.

    [Diagram: a large pink box labeled "4-byte aligned" containing a smaller blue box labeled "8-byte aligned".]

    Everything in the blue box is also in the pink box, but not vice versa.

    Which cast is the one occurring in the above pinvoke signature?

    In the above signature, the UInt64 is being allocated by the interop code, and therefore it is naturally aligned for UInt64, which means that it is 8-byte aligned. The GetProcessTimes function then treats those eight bytes as a FILETIME. So we are in the second case, where we cast from __int64* to FILETIME*.

    Mind you, you can avoid all this worrying by simply declaring your pinvoke more accurately. The correct solution is to declare the last four parameters as ComTypes.FILETIME. Now there are no sneaky games. Everything is exactly what it says it is.
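
    Bonus sketch: on the native side, the FILETIME documentation recommends copying the two 32-bit halves into a ULARGE_INTEGER rather than casting the pointer, which sidesteps the alignment question entirely. A minimal example (FileTimeToUInt64 is a made-up helper name):

     #include <windows.h>

     // Combine the two halves of a FILETIME into a 64-bit value by copying,
     // not by casting the pointer, so alignment never enters the picture.
     ULONGLONG FileTimeToUInt64(const FILETIME& ft)
     {
      ULARGE_INTEGER uli;
      uli.LowPart = ft.dwLowDateTime;
      uli.HighPart = ft.dwHighDateTime;
      return uli.QuadPart;
     }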

    Bonus reading: The article Use PowerShell to access registry last-modified time stamp shows how to use the ComTypes.FILETIME technique from PowerShell.

  • The Old New Thing

    If you are going to call Marshal.GetLastWin32Error, the function whose error you're retrieving had better be the one called most recently

    • 12 Comments

    Even if you remember to set SetLastError=true in your p/invoke signature, you still have to be careful with Marshal.GetLastWin32Error because there is only one last-error code, and it gets overwritten each time.

    So let's try this program:

    using System;
    using System.Runtime.InteropServices;
    
    class Program
    {
      [DllImport("user32.dll", SetLastError=true)]
      public static extern bool OpenIcon(IntPtr hwnd);
    
      public static void Main()
      {
        // Intentionally pass an invalid parameter.
        var result = OpenIcon(IntPtr.Zero);
        Console.WriteLine("result: {0}", result);
        Console.WriteLine("last error = {0}",
                          Marshal.GetLastWin32Error());
      }
    }
    

    The expectation is that the call to OpenIcon will fail, and the error code will be some form of invalid parameter.

    But when you run the program, it prints this:

    result: False
    last error = 0
    

    Zero?

    Zero means "No error". But the function failed. Where's our error code? We printed the result immediately after calling OpenIcon. We didn't call any other p/invoke functions. The last-error code should still be there.

    Oh wait, printing the result to the screen involves a function call.

    That function call might itself do a p/invoke!

    We have to call Marshal.GetLastWin32Error immediately after calling OpenIcon. Nothing else can sneak in between.

    using System;
    using System.Runtime.InteropServices;
    
    class Program
    {
      [DllImport("user32.dll", SetLastError=true)]
      public static extern bool OpenIcon(IntPtr hwnd);
    
      public static void Main()
      {
        // Intentionally pass an invalid parameter.
        var result = OpenIcon(IntPtr.Zero);
        var lastError = Marshal.GetLastWin32Error();
        Console.WriteLine("result: {0}", result);
        Console.WriteLine("last error = {0}",
                          lastError);
      }
    }
    

    Okay, now the program reports the error code as 1400: "Invalid window handle."

    This one was pretty straightforward, because the function call that modified the last-error code was right there in front of us. But there are other ways that code can run which are more subtle.

    • If you retrieve a property, the property retrieval may involve a p/invoke.
    • If you access a class that has a static constructor, the static constructor will secretly run if this is the first time the class is used.