• The Old New Thing

    Why Johnny can't read music

    • 9 Comments

    In the book He Bear, She Bear, the musical instrument identified as a tuba is clearly a sousaphone.

    (For those who are wondering what the title has to do with the topic of musical instrument identification: It's a reference to the classic book Why Johnny Can't Read.)

  • The Old New Thing

    Programmatically uploading a file to an FTP site

    • 24 Comments

    Today's Little Program uploads a file to an FTP site in binary mode with the assistance of the WinINet library. This program has sat in my bag of tools for years.

    #define STRICT
    #define UNICODE
    #include <windows.h>
    #include <wininet.h>
    
    // Link with wininet.lib.
    int __cdecl wmain(int argc, PWSTR argv[])
    {
     int exitCode = 1; // report failure unless the upload succeeds
    
     // Expect exactly five arguments after the program name.
     if (argc == 6) {
      HINTERNET hintRoot = InternetOpen(TEXT("ftpput/1.0"),
                INTERNET_OPEN_TYPE_DIRECT,
                NULL, NULL, 0);
      if (hintRoot) {
       // argv[1] = site, argv[2] = userid, argv[3] = password
       HINTERNET hintFtp = InternetConnect(hintRoot,
                argv[1],
                INTERNET_DEFAULT_FTP_PORT,
                argv[2],
                argv[3],
                INTERNET_SERVICE_FTP,
                INTERNET_FLAG_PASSIVE,
                NULL);
       if (hintFtp) {
        // argv[4] = local file, argv[5] = destination on the server
        if (FtpPutFile(hintFtp, argv[4], argv[5],
             FTP_TRANSFER_TYPE_BINARY,
             NULL)) {
         exitCode = 0;
        }
    
        InternetCloseHandle(hintFtp);
       }
    
       InternetCloseHandle(hintRoot);
      }
     }
    
     return exitCode;
    }
    

    The program accepts five command line arguments:

    1. site (no "ftp://" in front)
    2. userid
    3. password
    4. path for the file to upload
    5. location to place the uploaded file

    For example, I might say ftpput ftp.contoso.com admin seinfeld newversion.zip subdir/newversion.zip

  • The Old New Thing

    Converting from a UTC-based SYSTEMTIME directly to a local-time-based SYSTEMTIME

    • 18 Comments

    Last year, I presented this commutative diagram

    A 2-by-2 grid of boxes. The top row is labeled FILE­TIME; the bottom row is labeled SYSTEM­TIME. The first column is labeled UTC; the second column is labeled Local. The upper left box is labeled Get­System­Time­As­File­Time. There is an outgoing arrow to the right labeled File­Time­To­Local­File­Time leading to the box in the second column labeled None. There is an outgoing arrow downward labeled File­Time­To­System­Time leading to the box in the second row, first column, labeled Get­System­Time. From the box in the upper right corner labeled None, there is an outgoing arrow downward labeled File­Time­To­System­Time leading to the box in the second row, second column, labeled Get­Local­Time.
                UTC                                                  Local
    FILETIME    GetSystemTimeAsFileTime --FileTimeToLocalFileTime--> (None)
                           |                                           |
                 FileTimeToSystemTime                        FileTimeToSystemTime
                           v                                           v
    SYSTEMTIME  GetSystemTime                                    GetLocalTime

    I claimed that there was no function to complete the commutative diagram by connecting the bottom two boxes.

    I was wrong, but I'm going to try to get off on a technicality.

    You can connect the two boxes by calling System­Time­To­Tz­Specific­Local­Time with NULL as the time zone parameter, which means "Use the current time zone."

    The same diagram as above, but there is a new arrow connecting Get­System­Time to Get­Local­Time labeled System­Time­To­Tz­Specific­Local­Time.
                UTC                                                  Local
    FILETIME    GetSystemTimeAsFileTime --FileTimeToLocalFileTime--> (None)
                           |                                           |
                 FileTimeToSystemTime                        FileTimeToSystemTime
                           v                                           v
    SYSTEMTIME  GetSystemTime ---SystemTimeToTzSpecificLocalTime---> GetLocalTime

    This works here because the time being converted always refers to the current time.

    Here comes the technicality.

    This technique doesn't work in general because SystemTimeToTzSpecificLocalTime uses the time zone in effect at the time being converted, whereas the FileTimeToLocalFileTime function uses the time zone in effect right now. Furthermore, SystemTimeToTzSpecificLocalTime doesn't take into account changes in daylight saving rules that may have historically been different from the current set of rules. (Though this is easily repaired by switching to SystemTimeToTzSpecificLocalTimeEx.) The trick works here because the time we are converting is right now.

    In other words, the more general diagram does not commute. Instead, it looks more like this:

    Same as before, but this time the boxes are unlabeled, and the bottom right box is split in two. The inbound arrow from the left goes to one box and the inbound arrow from the top goes to another box. The two halves of the split boxes are marked as not equal.
                UTC                                             Local
    FILETIME    [ ] ---------FileTimeToLocalFileTime------------> [ ]
                 |                                                |
       FileTimeToSystemTime                             FileTimeToSystemTime
                 v                                                v
    SYSTEMTIME  [ ] --SystemTimeToTzSpecificLocalTimeEx--> [ ] ≠ [ ]

    This is why the documentation for File­Time­To­Local­File­Time tells you that if you want to get from the upper left corner to the upper right corner while accounting for daylight saving time relative to the time being converted, then you need to take the long way around.

    So what we have is not so much a commutative diagram as something like a covering space: If you start at any box and travel around the diagram, you won't necessarily end up where you started. Let's start at the upper left corner for the sake of example.

    Back to the four-box diagram, with empty boxes. The arrows follow a clockwise path. From the upper left, we go to the upper right via FileTimeToLocalFileTime, then to the bottom right via FileTimeToSystemTime, then to the bottom left via TzSpecificLocalTimeToSystemTime, then back to the upper left via SystemTimeToFileTime.

                UTC                                     Local
    FILETIME    [ ] -----FileTimeToLocalFileTime-------> [ ]
                 ^                                        |
       SystemTimeToFileTime                     FileTimeToSystemTime
                 |                                        v
    SYSTEMTIME  [ ] <--TzSpecificLocalTimeToSystemTime-- [ ]

    When you return to the upper left box, you might end up somewhere else, probably an hour ahead of or behind where you started. Each time you take a trip around the diagram, you drift another hour further away. Well, until you hit another daylight saving time changeover point.

  • The Old New Thing

    We're currently using FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH, but we would like our WriteFile to go even faster

    • 29 Comments

    A customer said that their program's I/O pattern is to open a file and then every so often write about 100KB of data into the file. They are currently using the FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH flags to open a file, and they wanted to know what else they could do to make their writes go even faster.

    Um, for one thing, you stop passing those two flags!

    Those two flags in combination basically mean "Give me the slowest possible I/O performance!" because they force all I/O to go through to the physical media right away.

    Removing the FILE_FLAG_WRITE_THROUGH flag will be a big help. This allows the hardware disk cache to do its normal job of completing the I/O immediately and performing the physical I/O lazily (perhaps in an optimized order based on subsequent writes). A 100KB write is a small enough write that your I/O time on rotational media will be dominated by the seek time. It'll take five to ten milliseconds to move the head into position and only one millisecond to write out the data. You're wasting 80% or more of your time just preparing for the write.

    Much better would be to issue the I/O without the FILE_FLAG_WRITE_THROUGH flag so that the entire 100KB I/O request goes into the hard drive on-board cache. (It will fit quite easily, since the on-board cache for today's hard drives will be 8 megabytes or larger.) Your Write­File will complete immediately, and the commit to physical storage will occur while your program is busy doing computation.

    If the writes truly are sporadic (as the customer claims), the I/O buffer will be flushed out by the time the next round of application I/O begins.

    Removing the FILE_FLAG_NO_BUFFERING flag will also help, because that allows the operating system disk cache to get involved. If the application reads back from the file, the read can be satisfied from the disk cache, avoiding the physical I/O entirely.

    As a side note, the FILE_FLAG_WRITE_THROUGH flag is largely ineffective nowadays, because SATA drivers ignore the flush request. The file system doesn't know that the driver is lying to it, so it will still do all the work on the assumption that the write-through request worked, even though we know that the extra work is ultimately pointless.

    For example, NTFS will issue metadata writes with a flush to ensure that the data on the physical media is consistent. But if the driver is ignoring flush requests, all this extra work accomplishes nothing aside from wasting I/O bandwidth. Even worse, NTFS thinks that the data on the drive is physically consistent, but it isn't. The result is that a poorly-timed power outage (or device removal) can result in metadata corruption that takes a chkdsk to repair.

    Now, it may be that the customer's program is using the FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH flags for a specific purpose unrelated to performance, so you can't just go walking in and ripping them out without understanding why they were there. But if they added the flags thinking that it would make the program run faster, then they were operating under a false assumption.

  • The Old New Thing

    Why do I have to add 1 to the color index when I set it as the hbrBackground of a window class?

    • 23 Comments

    Our scratch program sets the background color to COLOR_WINDOW by setting the class background brush as follows:

        wc.hbrBackground = (HBRUSH)(COLOR_WINDOW + 1);
    

    What's with the +1?

    Okay, first of all, let's backtrack a bit.

    The real first question is, "What's the deal with taking an integer (COLOR_WINDOW) and casting it to an HBRUSH and expecting anything sane to happen?"

    The window manager wants to provide multiple ways of setting the class background brush.

    1. The application can request that no automatic background drawing should occur at all.
    2. The application can request custom background drawing and provide that custom drawing by handling the WM_ERASE­BKGND message.
    3. The application can request that the background be a specific brush provided by the application.
    4. The application can request that the background be a specific system color.

    The first three cases are easy: If you don't want automatic background drawing, then pass the hollow brush. If you want custom background drawing, then pass NULL as the brush. And if you want background drawing with a specific brush, then pass that brush. It's the last case that is weird.

    Now, if Register­Class were being invented today, we would satisfy the last requirement by saying, "If you want the background to be a system color, then use a system color brush like this:

        wc.hbrBackground = GetSysColorBrush(COLOR_WINDOW);
    
    System color brushes match the corresponding system color, so this sets your background to whatever the current system window color is."

    But just as NASA couldn't use the Space Shuttle to rescue the Apollo 13 astronauts, the RegisterClass function couldn't use GetSysColorBrush for class brushes: At the time RegisterClass was designed, system color brushes had not yet been invented. In fact, they wouldn't be invented for over a decade.

    Therefore, Register­Class had to find some way of smuggling an integer inside a pointer, and the traditional way of doing this is to say that certain numerically-small pointer values are actually integers in disguise. We've seen this with the HINSTANCE returned by Shell­Execute, with the MAKE­INT­ATOM macro, with the MAKE­INT­RESOURCE/IS_INT­RESOURCE macro pair, and with the second parameter to the Get­Proc­Address function. (There are plenty of other examples.)

    The naïve solution would therefore be to say, "Well, if you want a system color to be used as the brush color, then just cast the COLOR_XXX value to an HBRUSH, and the Register­Class function will recognize it as a smuggled integer and treat it as a color code rather than an actual brush."

    And then you run into a problem: The numeric value of COLOR_SCROLLBAR is zero. Casting this to an HBRUSH would result in a NULL pointer, but a NULL brush already means something else: Don't draw any background at all.

    To avoid this conflict, the RegisterClass convention artificially adds 1 to the system color number so that none of its smuggled integers will be mistaken for NULL: You pass COLOR_XXX + 1, and the window manager subtracts the 1 back out.

  • The Old New Thing

    What order does the DIR command arrange files if no sort order is specified?

    • 28 Comments

    If you don't specify a sort order, then the DIR command lists the files in the order that the files are returned by the Find­First­File function.

    Um, okay, but that just pushes the question to the next level: What order does Find­First­File return files?

    The order in which FindFirstFile returns files is unspecified. It is left to the file system driver to return the files in whatever order it finds most convenient.

    Now we're digging into implementation details.

    For example, the classic FAT file system simply returns the names in the order they appear on disk, and when a file is created, it is merely assigned the first available slot in the directory. Slots become available when files are deleted, and if no slots are available, then a new slot is created at the end.

    Modern FAT (is that an oxymoron?) with long file names is more complicated because it needs to find a sequence of contiguous entries large enough to hold the name of the file.

    There used to be (maybe there still are) some low-level disk management utilities that would go in and manually reorder your directory entries.

    The NTFS file system internally maintains directory entries in a B-tree structure, which means that the most convenient way of enumerating the directory contents is in B-tree order, which if you cover one eye and promise not to focus too closely looks approximately alphabetical for US-English. (It's not very alphabetical for most other languages, and it falls apart once you add characters with diacritics or anything outside of the Latin alphabet, and that includes spaces and digits!)

    The ISO 9660 file system (used by CD-ROMs) requires that directory entries be lexicographically sorted by ASCII code point. Pretty much everybody has abandoned the base ISO 9660 file system and uses one of its many extensions, such as Joliet or UDF, so you have that additional wrinkle to deal with.

    If you are talking to a network file system, then the file system on the other end of the network cable could be anything at all, so who knows what its rules are (if it even has rules).

    When people ask this question, it's usually in the context of a media-playing device which plays media from a CD-ROM or USB thumb drive in the raw native file order. But they don't ask this question right out; they ask some side question that they think will solve their problem, but they don't come out and say what their problem is.

    So let's solve the problem in context: If the storage medium is a CD-ROM or an NTFS-formatted USB thumb drive, then the files will be enumerated in sort-of-alphabetical order, so you can give your files names like 000 First track.mp3, 001 Next track.mp3, and so on.

    If the storage medium is a FAT-formatted USB thumb drive, then the files will be enumerated in a complex order based on the order in which files are created and deleted and the lengths of their names. But the easy way out is simply to remove all the files from a directory, then move the files into the directory in the order you want them enumerated. That way, the first available slot is the one at the end of the directory, so the file entry gets appended.

    Of course, none of this behavior is contractual. NTFS would be completely within its rights to, for example, return entries in reverse alphabetical order on odd-numbered days. Therefore, you shouldn't write a program that relies on any particular order of enumeration. (Or even that the order of enumeration is consistent between two runs!)

  • The Old New Thing

    What two-year-olds think about when they are placed in time-out

    • 10 Comments

    My niece (two years old at the time) was put in the corner as punishment for some sort of misdeed. At the expiration of her punishment, her grandfather returned and asked her, "你乖唔乖?" (Are you going to be nice?)

    She cheerfully replied, "仲未乖!" (Still naughty!)

    In an unrelated incident, one of my honorary nieces was being similarly punished. She told her aunt who was passing nearby, "In a little while, my daddy is going to ask me if I'm sorry. I'm not really sorry, but I'm going to say that I am."

  • The Old New Thing

    Adventures in automation: Dismissing all message boxes from a particular application but only if they say that the operation succeeded

    • 13 Comments

    Suppose you have a program that is part of your workflow, and it has the annoying habit of showing a message box when it is finished. You want to automate this workflow, and part of that automation is dismissing the message box.

    Let's start by writing the annoying program:

    #include <windows.h>
    
    int WINAPI WinMain(
        HINSTANCE hinst, HINSTANCE hinstPrev,
        LPSTR lpCmdLine, int nCmdShow)
    {
      Sleep(5000);
      MessageBox(nullptr, GetTickCount() % 1000 < 800 ?
                 "Succeeded!" : "Failed!", "Annoying", MB_OK);
      return 0;
    }
    

    This annoying program pretends to do work for a little while, and then displays a message box saying whether or not it succeeded. (Let's say it succeeds 80% of the time.)

    Our Little Program will automate this task and respond based on whether the operation succeeded. This is just a small extension of our previous program which logs the contents of every message box, except we are paying attention only to one specific message box.

    To the rescue: UI Automation.

    using System.Windows.Automation;
    using System.Diagnostics;
    using System.Threading;
    
    class Program
    {
     [System.STAThread]
     public static void Main(string[] args)
     {
      int processId = 0;
      bool succeeded = false;
      var resultReady = new ManualResetEvent(false);
    
      Automation.AddAutomationEventHandler(
       WindowPattern.WindowOpenedEvent,
       AutomationElement.RootElement,
       TreeScope.Children,
       (sender, e) =>
       {
        var element = sender as AutomationElement;
    
        if ((int)element.GetCurrentPropertyValue(
                 AutomationElement.ProcessIdProperty) != processId)
        {
         return;
        }
    
        var text = element.FindFirst(TreeScope.Children,
          new PropertyCondition(AutomationElement.AutomationIdProperty,
                                "65535"));
        if (text != null && text.Current.Name == "Succeeded!")
        {
         succeeded = true;
         var okButton = element.FindFirst(TreeScope.Children,
           new PropertyCondition(AutomationElement.AutomationIdProperty,
                                 "2"));
         var invokePattern = okButton.GetCurrentPattern(
           InvokePattern.Pattern) as InvokePattern;
         invokePattern.Invoke();
        }
    
        resultReady.Set();
       });
    
      // Start the annoying process
      Process p = Process.Start("annoying.exe");
      processId = p.Id;
    
      // Wait for the result
      resultReady.WaitOne();
    
      if (succeeded)
      {
       Process.Start("calc.exe");
      }
    
      Automation.RemoveAllEventHandlers();
     }
    }
    

    Most of this program you've seen before.

    We register an automation event handler for new window creation that ignores windows that belong to processes we don't care about. That keeps us from being faked out by windows that happen to be created while our annoying task is running.

    For simplicity's sake, I've removed other sanity checks which verify that the window which appeared is the one we actually care about. In real life, we might check the window class or the window title. I left that out because it's not really relevant to the story.

    Once we think we have the window we want, we suck out the text so we can see whether the message was a success or failure message. If it was a success, we dismiss the dialog box by pushing the OK button; otherwise we leave the error message on the screen so the user can see what happened. Either way, we signal the main thread that we have a result.

    After registering the event handler, we run the annoying process, tell the event handler the process ID, and wait for the signal. Once the signal arrives, we see whether it declared the operation a success, and if so, we proceed to the next step of, um, say, launching the calculator. (I just picked something arbitrary.)

  • The Old New Thing

    How can I detect that my program was run from Task Scheduler, or my custom shortcut, or a service, or whatever

    • 53 Comments

    Suppose you want your program to behave differently depending on whether it is launched from the Start menu, or by clicking the pinned icon on the taskbar, or by Scheduled Task, or from a service, or whatever. How can a program detect and distinguish these scenarios?

    The answer is you don't. And you shouldn't try.

    Instead of trying to guess how your program was executed, you should have the launcher tell you how it is executing your program. You do this by registering a different command line for each of the scenarios, and then checking for that command line in the program. (We saw a variation of this a little while ago.)

    For example, you could have your Start menu shortcut contain one command line parameter, give the taskbar pinned shortcut a different command line parameter, register yet another command line parameter with the task scheduler, and have the service launch the program with a still different command line parameter.

    They all run the same program, but the command line parameter lets the program know what context it is being run in and alter its behavior accordingly.

    It's like creating multiple email addresses that all map to the same inbox. Many email services let you take an email address and insert a plus sign followed by anything else you like before the at-sign, and it'll all get delivered to the same inbox. The thing after the plus-sign is ignored for delivery purposes, but you can use it to help organize your inbox, so you know that the message sent to bob+expos@contoso.com is related to your fantasy baseball team, whereas bob+ff@contoso.com is something about your frequent flier account.

    One thing you shouldn't do is try to guess, however. Programs that magically change their behavior based on details of the environment lead to problems that are very difficult to debug.

    Given this discussion, perhaps you can provide guidance to this customer:

    How can my DLL detect that it is running inside a service?

  • The Old New Thing

    What does the SEE_MASK_UNICODE flag in ShellExecuteEx actually do?

    • 15 Comments

    Somebody with a rude name wonders what the SEE_MASK_UNICODE flag does.

    It does nothing.

    The flag was introduced when porting the Windows 95 shell to Windows NT. It happened further back in history than I have permission to access in the Windows source code history database, but I can guess how it got introduced.

    One of the things that the porting team had to do was make Unicode versions of all the ANSI functions that Windows 95 created. Sometimes this was done by creating separate A and W versions of a function. Sometimes this was done by having separate A and W versions of an interface. And sometimes this was done by adding additional fields to the A version of a structure with a flag that says whether the ANSI or Unicode members should be used.

    My guess is that the porting team initially decided to make Shell­Execute­Ex use that third model, where the SHELL­EXECUTE­INFO structure had a SHELL­EXECUTE­INFO­EX extension with Unicode strings, and the mask specified whether the caller preferred you to use the ANSI strings or the Unicode strings.

    Presumably they decided to change course and switch to having separate SHELL­EXECUTE­INFOA and SHELL­EXECUTE­INFOW structures. But when they switched from one model to the other, they left that flag behind, probably with the intention of removing it once all existing callers had been updated to stop passing the flag, but they never managed to get around to it.

    So the flag is just sitting in the header file even though nobody pays any attention to it.
