• The Old New Thing

    How do I show the contents of a directory while respecting the user's preferences for hidden and super-hidden files as well as the user's language preferences?

    A customer was writing a program in (and this is what they said) "32 bit C++ .Net 4.0" which displayed the contents of a directory, and they wanted to filter out items such as hidden files and protected operating system files (also known as super-hidden files) based on the user's current Explorer preferences. Furthermore, they wanted to show localized folder names, such as Usuarios instead of Users, again, the same way Explorer does. They are currently using Directory.GetDirectories().

    The way to do this is to use IShellFolder::EnumObjects, the same way Explorer does. Don't pass SHCONTF_INCLUDEHIDDEN or SHCONTF_INCLUDESUPERHIDDEN, and you will get the default enumeration that filters out hidden items based on the user's preferences. (You pass those flags to force the items to be included, overriding the user's preferences.) The names of the items that come out of the enumeration will be the localized names. You can ask for the parsing name to get the physical file name.

    #define UNICODE
    #define _UNICODE
    #define STRICT
    #define STRICT_TYPED_ITEMIDS
    #include <windows.h>
    #include <shlobj.h>
    #include <atlbase.h>
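    #include <atlalloc.h>
    #include <shlwapi.h>   // StrRetToStrW
    #include <stdio.h>     // wprintf
    #pragma comment(lib, "shlwapi.lib")
    
    // The program below uses two helpers that it does not define.
    // These are minimal versions, assumed here so the listing builds.
    
    // Initializes COM for the lifetime of the object.
    class CCoInitialize {
    public:
     CCoInitialize() : m_hr(CoInitialize(nullptr)) { }
     ~CCoInitialize() { if (SUCCEEDED(m_hr)) CoUninitialize(); }
     operator HRESULT() const { return m_hr; }
    private:
     HRESULT m_hr;
    };
    
    // Prints the name of a child item in the form requested by uFlags.
    void PrintDisplayName(IShellFolder *psf, PCUITEMID_CHILD pidl,
                          SHGDNF uFlags, PCWSTR pszLabel)
    {
     STRRET sr;
     PWSTR pszName;
     if (SUCCEEDED(psf->GetDisplayNameOf(pidl, uFlags, &sr)) &&
         SUCCEEDED(StrRetToStrW(&sr, pidl, &pszName))) {
      wprintf(L"%ls = %ls\n", pszLabel, pszName);
      CoTaskMemFree(pszName);
     }
    }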
    
    int __cdecl wmain(int argc, wchar_t **argv)
    {
     CCoInitialize init;
    
     if (argc < 2) return 0;
     CComHeapPtr<ITEMIDLIST_ABSOLUTE> sppidl;
     CComPtr<IShellFolder> spsf;
     CComPtr<IEnumIDList> speidl;
     if (FAILED(SHParseDisplayName(argv[1], nullptr,
                                   &sppidl, 0, nullptr)) ||
         FAILED(SHBindToObject(nullptr, sppidl,
                               nullptr, IID_PPV_ARGS(&spsf))) ||
         FAILED(spsf->EnumObjects(nullptr,
                   SHCONTF_FOLDERS | SHCONTF_NONFOLDERS, &speidl)) ||
         speidl == nullptr) return 0;
    
     for (CComHeapPtr<ITEMID_CHILD> sppidlItem;
          speidl->Next(1, &sppidlItem, nullptr) == S_OK;
          sppidlItem.Free()) {
      PrintDisplayName(spsf, sppidlItem, SHGDN_NORMAL, L"Display Name");
      PrintDisplayName(spsf, sppidlItem, SHGDN_FORPARSING, L"For Parsing");
      wprintf(L"\n");
     }
    }
    

    The program takes a fully-qualified path on the command line and displays its contents (both in localized display name and in raw file system paths) while respecting the user's preferences for hidden and super-hidden files.

    It appears that the customer is writing their program in C#, despite their claim that they were using C++ (or maybe they meant MC++ or C++/CLI). In that case, they can use the Windows 7 API CodePack for Microsoft® .NET Framework (gotta love that catchy name).

  • The Old New Thing

    How do I create an IShellItemArray from a bunch of file paths?

    The IFile­Operation interface accepts bulk operations in the form of an IShell­Item­Array. So how do you take a list of file names and convert them into an IShell­Item­Array?

    There is no SHCreateShellItemArrayFromPaths function, but there is a SHCreateShellItemArrayFromIDLists, and we know how to convert a path to an ID list, namely via SHParseDisplayName. So let's snap two blocks together.

    #define UNICODE
    #define _UNICODE
    #define STRICT
    #define STRICT_TYPED_ITEMIDS
    #include <windows.h>
    #include <shlobj.h>
    #include <wrl/client.h>
    #include <new>          // std::nothrow
    
    // class CCoInitialize incorporated by reference
    
    template<typename T>
    HRESULT CreateShellItemArrayFromPaths(
        UINT ct, T rgt[], IShellItemArray **ppsia)
    {
     *ppsia = nullptr;
    
     PIDLIST_ABSOLUTE *rgpidl = new(std::nothrow) PIDLIST_ABSOLUTE[ct];
     HRESULT hr = rgpidl ? S_OK : E_OUTOFMEMORY;
    
     UINT cpidl; // unsigned, to match ct
     for (cpidl = 0; SUCCEEDED(hr) && cpidl < ct; cpidl++)
     {
      hr = SHParseDisplayName(rgt[cpidl], nullptr, &rgpidl[cpidl], 0, nullptr);
     }
    
     if (SUCCEEDED(hr)) {
      hr = SHCreateShellItemArrayFromIDLists(cpidl, rgpidl, ppsia);
     }
    
     // CoTaskMemFree accepts null, so a slot left empty by a failed parse is harmless.
     for (UINT i = 0; i < cpidl; i++)
     {
      CoTaskMemFree(rgpidl[i]);
     }
    
     delete[] rgpidl;
     return hr;
    }
    

    The Create­Shell­Item­Array­From­Paths template function takes an array of paths and starts by creating a corresponding array of ID lists. (If you're feeling fancy, you can use a file system bind context to make simple ID lists.) It then pumps this array into the SHCreate­Shell­Item­Array­From­ID­Lists function to get the item array.

    Using a template allows you to pass an array of anything as the array of paths, as long as it has a conversion to PCWSTR. So you can pass an array of PCWSTR or an array of PWSTR or an array of BSTR or an array of CCom­Heap­Ptr<wchar_t> or an array of CStringW or whatever else floats your boat.
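
    To see why the template is so accommodating, here is a portable sketch of the same pattern with the shell taken out of the picture. The names PathLength, OwnedPath, and TotalPathLength are hypothetical and exist only for this illustration. The one thing the template asks of T is an implicit conversion to const wchar_t *, which is what PCWSTR is.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical stand-in for a function that wants a PCWSTR,
// the way SHParseDisplayName does.
std::size_t PathLength(const wchar_t *path)
{
 assert(path != nullptr);
 return std::wcslen(path);
}

// Hypothetical owning string type with an implicit conversion,
// in the spirit of CStringW.
struct OwnedPath
{
 std::wstring value;
 operator const wchar_t *() const { return value.c_str(); }
};

// Same shape as CreateShellItemArrayFromPaths: each rgt[i] is handed
// to a function that takes a const wchar_t *, so any T with an
// implicit conversion to that type is accepted.
template<typename T>
std::size_t TotalPathLength(unsigned ct, T rgt[])
{
 std::size_t total = 0;
 for (unsigned i = 0; i < ct; i++) {
  total += PathLength(rgt[i]);
 }
 return total;
}
```

    An array of const wchar_t *, an array of wchar_t *, and an array of OwnedPath all compile against the same template, because the conversion happens at the call to PathLength rather than in the template's signature.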

    Let's take this function out for a spin.

    int __cdecl wmain(int argc, wchar_t **argv)
    {
     CCoInitialize init;
    
     Microsoft::WRL::ComPtr<IShellItemArray> spsia;
     Microsoft::WRL::ComPtr<IFileOperation> spfo;
    
     if (SUCCEEDED(CreateShellItemArrayFromPaths(
                          argc - 1, argv + 1, &spsia)) &&
         SUCCEEDED(CoCreateInstance(__uuidof(FileOperation), nullptr,
                          CLSCTX_ALL, IID_PPV_ARGS(&spfo)))) {
      spfo->DeleteItems(spsia.Get());
      spfo->PerformOperations();
     }
    
     return 0;
    }
    

    The main program first treats the command line arguments as a list of absolute file paths and uses our new helper function to create a shell item array from them. It then passes the shell item array to the IFile­Operation::Delete­Items method to delete all the items.

    No magic here. Just taking the pieces available and combining them in a relatively obvious way.

  • The Old New Thing

    Why does the Directory.GetFiles method sometimes ignore *.html files when I ask for *.htm?

    The documentation for the Directory.Get­Files method says

    When using the asterisk wildcard character in a search­Pattern, such as "*.txt", the matching behavior when the extension is exactly three characters long is different than when the extension is more or less than three characters long. A search­Pattern with a file extension of exactly three characters returns files having an extension of three or more characters, where the first three characters match the file extension specified in the search­Pattern. A search­Pattern with a file extension of one, two, or more than three characters returns only files having extensions of exactly that length that match the file extension specified in the search­Pattern. When using the question mark wildcard character, this method returns only files that match the specified file extension. For example, given two files, "file1.txt" and "file1.txtother", in a directory, a search pattern of "file?.txt" returns just the first file, while a search pattern of "file*.txt" returns both files.

    A customer reported that one of their programs stopped working, and they traced the problem to the fact that a search for *.htm on some machines was no longer returning files like awesome.html, contrary to the documentation. What's going on?

    What's going on is that the documentation is trying too hard to explain an observed behavior. (My guess is that some other customer reported the behavior, and the documentation team incorporated the customer's observations into the documentation without really thinking it through.)

    The real issue is that the Get­Files method matches against both short file names and long file names. If a long file name has an extension that is longer than three characters, the extension is truncated to form the short file name. And it is that short file name that gets matched by *.htm or *.txt.

    Even as originally written, in the presence of short file names, the documentation is wrong, because it would imply that a search for reallylong*.txt could match reallylong_filename.txtother. But try it: It doesn't. That's because the short name is probably REALLY~1.TXT, and that doesn't match reallylong*.txt.

    What happened on the customer's machines is that short file name generation was disabled on the drive at the time the files were created, so no short file name was available, and consequently there was no SHORTN~1.HTM name to match against.

    The documentation should really say something more like this:

    Because this method checks against file names with both the 8.3 file name format (if available) and the long file name format, a search pattern like "*.txt" may return unexpected results. For example, the file longfilename.txtother may be returned if the short file name for the file is LONGFI~1.TXT.

    Update: It looks like the documentation has added my alternate remarks, but they kept the original misleading remarks as well, so now it's double-confusing. And to make things even more confusing, the original misleading remark has been made even more misleading in the part where it talks about question marks overriding the three-character rule. This is another failed attempt to explain observed behavior. If you search for "file?.txt", it will not match "file1.txtother". But the reason is not that the question mark overrides the three-character rule. The reason is that the short name for "file1.txtother" is "FILE1~1.TXT", and the question mark matches only one character.
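
    The matching rule is easy to model. What follows is a minimal sketch, not the file system's actual algorithm: WildcardMatch is a deliberately simplified matcher, FileMatches is a hypothetical helper, and the short names passed in are the ones discussed above (AWESOM~1.HTM is my guess at what awesome.html would have been assigned). An empty short name stands for a drive where short name generation is disabled.

```cpp
#include <cassert>
#include <cwctype>
#include <string>

// Simplified wildcard match: '*' matches any run of characters,
// '?' matches exactly one character, and everything else is
// compared case-insensitively.
bool WildcardMatch(const wchar_t *pattern, const wchar_t *name)
{
 assert(pattern != nullptr && name != nullptr);
 if (*pattern == L'\0') return *name == L'\0';
 if (*pattern == L'*') {
  // Let '*' absorb zero or more characters of the name.
  for (const wchar_t *rest = name; ; rest++) {
   if (WildcardMatch(pattern + 1, rest)) return true;
   if (*rest == L'\0') return false;
  }
 }
 if (*name == L'\0') return false;
 if (*pattern != L'?' &&
     std::towupper(*pattern) != std::towupper(*name)) return false;
 return WildcardMatch(pattern + 1, name + 1);
}

// A file is reported if the pattern matches either its long name
// or its short (8.3) name, provided it has a short name at all.
bool FileMatches(const std::wstring &pattern,
                 const std::wstring &longName,
                 const std::wstring &shortName)
{
 return WildcardMatch(pattern.c_str(), longName.c_str()) ||
        (!shortName.empty() &&
         WildcardMatch(pattern.c_str(), shortName.c_str()));
}
```

    With this model, *.htm finds awesome.html only when the short name is present, reallylong*.txt does not find reallylong_filename.txtother because neither name matches, and file?.txt does not find file1.txtother because the question mark cannot cover the ~1 in FILE1~1.TXT.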

  • The Old New Thing

    Operations jargon: Internet egress

    As I've noted before, the operations team has their own jargon which superficially resembles English. Some time ago, they sent out a message with the subject A New Internet Egress Path Is Coming.

    Translation: We're changing the way computers access the Internet.

    Bonus jargon: traffic on the edge. This does not refer to traffic that is on the verge of a nervous breakdown. It merely refers to traffic that crosses the boundary between intranet and Internet.

  • The Old New Thing

    On live performances of Star Trek

    Spock's Brain is generally considered to be the worst episode of Star Trek. That may be why in 2009 Mike Carano decided to perform it as a theatrical production. Here is the opening scene, and here's Carano talking about the show's genesis. In the second video, skip ahead to 2:40 to see more clips from the show, or go to 4:35 for the fight scene.

    Whereas Carano played the show for laughs, the folks at Atomic Arts in Portland (yes, that Portland) played it straight for their Trek in the Park series, but they still get laughs because Star Trek.

    2009 Amok Time
    2010 Space Seed
    2011 Mirror, Mirror
    2012 Journey to Babel
    2013 The Trouble with Tribbles

    And yes, when the Enterprise is hit, everybody jerks to the left or right, even the unconscious bodies in sickbay.

    Their five-year mission complete, the Atomic Arts folks are unloading their larger set pieces, so if you always wanted a pair of sickbay beds that can pump Vulcan blood, well now you know where to go.

    But just because Trek in the Park is over doesn't mean you should give up on Star Trek live in the park yet. Seattle arts group Hello Earth Productions continues to stage Star Trek episodes in public parks under the name Outdoor Trek. Hello Earth aims for a more creative interpretation rather than trying to do a perfect impersonation of the original.

    2010 The Naked Time
    2011 This Side of Paradise
    2012 (hiatus)
    2013 Devil in the Dark (bonus Horta content)
    2014 Mirror, Mirror

    They are currently holding auditions for Mirror, Mirror.

  • The Old New Thing

    How do I disable zone markers for downloaded files, so that Explorer stops being a nag about running downloaded files and just trusts me to do the right thing?

    My Little Program about manipulating the zone identifier for downloaded files appears to have struck a nerve with commenter Tess, who launched into some sort of diatribe about how Microsoft should stop being a busybody and stop warning users about opening files that they downloaded.

    You are welcome to disable the feature if it offends you so.

    In the Group Policy editor, go to User Configuration, Administrative Templates, Windows Components, Attachment Manager, and enable Do not preserve zone information in file attachments.

    For bonus points, you can set a bunch of other policies to make your computer even more dangerous. Here's a list of them. For example, if your goal is to create the most insecure deployment of Internet Explorer, you can set Inclusion list for moderate risk file types and Inclusion list for low risk file types both to *.*, and then on top of that, set Launching applications and unsafe files to Enabled (not secure) so that Internet Explorer never warns you about running anything.

    Welcome to 1995. Enjoy your stay.

  • The Old New Thing

    Why Johnny can't read music

    In the book He Bear, She Bear, the musical instrument identified as a tuba is clearly a sousaphone.

    (For those who are wondering what the title has to do with the topic of musical instrument identification: It's a reference to the classic book Why Johnny Can't Read.)

  • The Old New Thing

    Programmatically uploading a file to an FTP site

    Today's Little Program uploads a file to an FTP site in binary mode with the assistance of the Wininet library. This program has sat in my bag of tools for years.

    #define STRICT
    #define UNICODE
    #include <windows.h>
    #include <wininet.h>
    #include <shellapi.h>
    
    int __cdecl wmain(int argc, PWSTR argv[])
    {
     if (argc == 6) {
      HINTERNET hintRoot = InternetOpen(TEXT("ftpput/1.0"),
                INTERNET_OPEN_TYPE_DIRECT,
                NULL, NULL, 0);
      if (hintRoot) {
       HINTERNET hintFtp = InternetConnect(hintRoot,
                argv[1],
                INTERNET_DEFAULT_FTP_PORT,
                argv[2],
                argv[3],
                INTERNET_SERVICE_FTP,
                INTERNET_FLAG_PASSIVE,
                NULL);
       if (hintFtp) {
        FtpPutFile(hintFtp, argv[4], argv[5],
             FTP_TRANSFER_TYPE_BINARY,
             NULL);
    
        InternetCloseHandle(hintFtp);
       }
    
       InternetCloseHandle(hintRoot);
      }
     }
    
     return 0;
    }
    

    The program accepts five command line arguments:

    1. site (no "ftp://" in front)
    2. userid
    3. password
    4. path for the file to upload
    5. location to place the uploaded file

    For example, I might say ftpput ftp.contoso.com admin seinfeld newversion.zip subdir/newversion.zip

  • The Old New Thing

    Converting from a UTC-based SYSTEMTIME directly to a local-time-based SYSTEMTIME

    Last year, I presented this commutative diagram

                           UTC                                                Local
    FILETIME     GetSystemTimeAsFileTime -----FileTimeToLocalFileTime------> (None)
                            |                                                   |
                  FileTimeToSystemTime                                FileTimeToSystemTime
                            v                                                   v
    SYSTEMTIME        GetSystemTime                                       GetLocalTime

    I claimed that there was no function to complete the commutative diagram by connecting the bottom two boxes.

    I was wrong, but I'm going to try to get off on a technicality.

    You can connect the two boxes by calling System­Time­To­Tz­Specific­Local­Time with NULL as the time zone parameter, which means "Use the current time zone."

                           UTC                                                Local
    FILETIME     GetSystemTimeAsFileTime -----FileTimeToLocalFileTime------> (None)
                            |                                                   |
                  FileTimeToSystemTime                                FileTimeToSystemTime
                            v                                                   v
    SYSTEMTIME        GetSystemTime --SystemTimeToTzSpecificLocalTime-->  GetLocalTime

    This works here because the time being converted always refers to the current time.

    Here comes the technicality.

    This technique doesn't work in general because SystemTimeToTzSpecificLocalTime uses the time zone in effect at the time being converted, whereas the FileTimeToLocalFileTime function uses the time zone in effect right now. Furthermore, SystemTimeToTzSpecificLocalTime doesn't take into account daylight saving rules that may historically have been different from the current set of rules. (Though this is easily repaired by switching to SystemTimeToTzSpecificLocalTimeEx.) The trick works here because the time we are converting is right now.
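
    The distinction is easier to see with numbers. Here is a minimal sketch that uses a made-up time zone rather than the real Windows functions: every constant and function name below is hypothetical, times are plain seconds on an arbitrary scale, and the two conversion functions only model the behaviors described above.

```cpp
#include <cassert>
#include <cstdint>

// Toy time zone, entirely hypothetical: UTC-8 in standard time and
// UTC-7 while daylight time is in effect. Times are seconds on an
// arbitrary scale that starts at zero, and daylight time covers the
// UTC instants in [kDstStart, kDstEnd).
constexpr std::int64_t kHour = 3600;
constexpr std::int64_t kDay = 24 * kHour;
constexpr std::int64_t kDstStart = 100 * kDay;
constexpr std::int64_t kDstEnd = 300 * kDay;

// UTC offset in effect at the given UTC instant.
std::int64_t OffsetAtUtc(std::int64_t utc)
{
 assert(utc >= 0); // the toy scale starts at zero
 bool isDaylight = utc >= kDstStart && utc < kDstEnd;
 return (isDaylight ? -7 : -8) * kHour;
}

// Models the FileTimeToLocalFileTime behavior described above:
// apply whatever offset is in effect right now.
std::int64_t ToLocalUsingCurrentOffset(std::int64_t utc, std::int64_t now)
{
 return utc + OffsetAtUtc(now);
}

// Models the SystemTimeToTzSpecificLocalTime behavior described above:
// apply the offset that was in effect at the time being converted.
std::int64_t ToLocalUsingOffsetAtThatTime(std::int64_t utc)
{
 return utc + OffsetAtUtc(utc);
}
```

    If the time being converted is the current time, the two functions agree. If it is summer now and the time being converted is from last winter, the two results differ by an hour.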

    In other words, the more general diagram does not commute. Instead, it looks more like this:

                           UTC                                                Local
    FILETIME               [ ] ------------FileTimeToLocalFileTime-----------> [ ]
                            |                                                   |
                  FileTimeToSystemTime                                FileTimeToSystemTime
                            v                                                   v
    SYSTEMTIME             [ ] --SystemTimeToTzSpecificLocalTimeEx----> [ ] != [ ]

    This is why the documentation for File­Time­To­Local­File­Time tells you that if you want to get from the upper left corner to the upper right corner while accounting for daylight saving time relative to the time being converted, then you need to take the long way around.

    So what we have is not so much a commutative diagram as something like a covering space: If you start at any box and travel around the diagram, you won't necessarily end up where you started. Let's start at the upper left corner for the sake of example.

                           UTC                                                Local
    FILETIME               [ ] ------------FileTimeToLocalFileTime-----------> [ ]
                            ^                                                   |
                  SystemTimeToFileTime                                FileTimeToSystemTime
                            |                                                   v
    SYSTEMTIME             [ ] <-------TzSpecificLocalTimeToSystemTime-------- [ ]

    When you return to the upper left box, you might end up somewhere else, probably an hour ahead of or behind where you started. Each time you take a trip around the diagram, you drift another hour further away. Well, until you hit another daylight saving time changeover point.
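
    Here is a minimal sketch of that drift, again using a made-up time zone instead of the real Windows functions. All names and constants are hypothetical. Times are plain seconds on an arbitrary scale, and the model is only meant to be trusted away from a changeover, where local times are unambiguous.

```cpp
#include <cassert>
#include <cstdint>

// Toy time zone, entirely hypothetical: UTC-8 in standard time and
// UTC-7 while daylight time is in effect, which is for the UTC
// instants in [kDstStart, kDstEnd).
constexpr std::int64_t kHour = 3600;
constexpr std::int64_t kDay = 24 * kHour;
constexpr std::int64_t kDstStart = 100 * kDay;
constexpr std::int64_t kDstEnd = 300 * kDay;

bool IsDaylightAtUtc(std::int64_t utc)
{
 return utc >= kDstStart && utc < kDstEnd;
}

// Offset applied by the step that uses the rules in effect right now.
std::int64_t CurrentOffset(std::int64_t now)
{
 return (IsDaylightAtUtc(now) ? -7 : -8) * kHour;
}

// Offset applied by the step that uses the rules in effect at the
// local time being converted. Guessing the UTC instant with the
// standard offset is good enough away from a changeover.
std::int64_t OffsetAtLocal(std::int64_t local)
{
 return (IsDaylightAtUtc(local + 8 * kHour) ? -7 : -8) * kHour;
}

// One clockwise trip around the diagram, starting and ending at the
// upper left (UTC) corner. The two FILETIME/SYSTEMTIME steps only
// change the representation, so they drop out of the arithmetic.
std::int64_t OneTripAround(std::int64_t utc, std::int64_t now)
{
 assert(utc >= 0 && now >= 0); // the toy scale starts at zero
 std::int64_t local = utc + CurrentOffset(now); // today's offset
 return local - OffsetAtLocal(local);           // that day's offset
}
```

    Starting from a winter time while it is currently summer, each call to OneTripAround comes back one hour later than it left. A time that falls in the same half of the year as the current time comes back unchanged, which is why the drift stops once it crosses a changeover.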

  • The Old New Thing

    We're currently using FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH, but we would like our WriteFile to go even faster

    A customer said that their program's I/O pattern is to open a file and then every so often write about 100KB of data into the file. They are currently using the FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH flags to open a file, and they wanted to know what else they could do to make their writes go even faster.

    Um, for one thing, you stop passing those two flags!

    Those two flags in combination basically mean "Give me the slowest possible I/O performance!" because they force all I/O to go through to the physical media right away.

    Removing the FILE_FLAG_WRITE_THROUGH flag will be a big help. This allows the hardware disk cache to do its normal job of completing the I/O immediately and performing the physical I/O lazily (perhaps in an optimized order based on subsequent writes). A 100KB write is a small enough write that your I/O time on rotational media will be dominated by the seek time. It'll take five to ten milliseconds to move the head into position and only one millisecond to write out the data. You're wasting 80% or more of your time just preparing for the write.
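
    The arithmetic behind that percentage can be sketched in a few lines. The function names are hypothetical, and the 100 MB/s transfer rate used in the note below is an assumed figure, chosen only because it reproduces the one-millisecond estimate. It is not a number from the customer.

```cpp
#include <cassert>

// Time to transfer the data once the head is in position.
// The throughput is a parameter because no figure is given above.
double TransferTimeMs(double bytes, double bytesPerSecond)
{
 assert(bytesPerSecond > 0);
 return bytes / bytesPerSecond * 1000.0;
}

// Fraction of each write spent positioning the head rather than
// writing data.
double SeekOverheadFraction(double seekMs, double transferMs)
{
 assert(seekMs >= 0 && transferMs > 0);
 return seekMs / (seekMs + transferMs);
}
```

    At an assumed 100 MB/s, a 100KB write takes about one millisecond to transfer. With a five-millisecond seek, the overhead is 5 / (5 + 1), or about 83%. With a ten-millisecond seek, it is 10 / (10 + 1), or about 91%.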

    Much better would be to issue the I/O without the FILE_FLAG_WRITE_THROUGH flag so that the entire 100KB I/O request goes into the hard drive on-board cache. (It will fit quite easily, since the on-board cache for today's hard drives will be 8 megabytes or larger.) Your Write­File will complete immediately, and the commit to physical storage will occur while your program is busy doing computation.

    If the writes truly are sporadic (as the customer claims), the I/O buffer will be flushed out by the time the next round of application I/O begins.

    Removing the FILE_FLAG_NO_BUFFERING flag will also help, because that allows the operating system disk cache to get involved. If the application reads back from the file, the read can be satisfied from the disk cache, avoiding the physical I/O entirely.

    As a side note, the FILE_FLAG_WRITE_THROUGH flag is largely ineffective nowadays, because SATA drivers ignore the flush request. The file system doesn't know that the driver is lying to it, so it will still do all the work on the assumption that the write-through request worked, even though we know that the extra work is ultimately pointless.

    For example, NTFS will issue metadata writes with a flush to ensure that the data on the physical media is consistent. But if the driver is ignoring flush requests, all this extra work accomplishes nothing aside from wasting I/O bandwidth. Even worse, NTFS thinks that the data on the drive is physically consistent, but it isn't. The result is that a poorly-timed power outage (or device removal) can result in metadata corruption that takes a chkdsk to repair.

    Now, it may be that the customer's program is using the FILE_FLAG_NO_BUFFERING and FILE_FLAG_WRITE_THROUGH flags for a specific purpose unrelated to performance, so you can't just go walking in and ripping them out without understanding why they were there. But if they added the flags thinking that it would make the program run faster, then they were operating under a false assumption.
