• The Old New Thing

    If you want to set a thread's apartment model via Thread.CurrentThread.ApartmentState, you need to act quickly

    • 18 Comments

    Welcome to CLR Week 2014. Don't worry, it'll be over in a few days.

    A customer wanted to know why their Folder­Browser­Dialog was displaying the infamous Current thread must be set to single thread apartment (STA) mode before OLE calls can be made error.

    private void btnBrowseFolder_Click(object sender, System.EventArgs e)
    {
      Thread.CurrentThread.ApartmentState = ApartmentState.STA;
      FolderBrowserDialog fbd = new FolderBrowserDialog {
        RootFolder = System.Environment.SpecialFolder.MyComputer,
        ShowNewFolderButton = true,
        Description = "Select the awesome folder..."
      };
      DialogResult dr = fbd.ShowDialog();
      ...
    }
    

    "Even though we set the Apartment­State to STA, the apartment state is still MTA. Curiously, if we put the above code in a standalone test program, it works fine."

    The problem is that the customer is changing the apartment state too late.

    On the first call to unmanaged code, the runtime calls Co­Initialize­Ex to initialize the COM apartment as either an MTA or an STA apartment. You can control the type of apartment created by setting the System.Threading.ApartmentState property on the thread to MTA, STA, or Unknown.

    Notice that the value you specify in Current­Thread.Apartment­State is consulted at the point the runtime initializes the COM apartment (which occurs on the first call to unmanaged code). If you change it after the COM apartment has been initialized, you're revising the blueprints of a house after it has been built.

    The standard way to avoid this problem is to attach the [STAThread] attribute to your Main function, or if you need to set the apartment model of a thread you created yourself, call the Thread.Set­Apartment­State method before the thread starts.

  • The Old New Thing

    The case of the orphaned LpdrLoaderLock critical section

    • 11 Comments

    A customer found that under heavy load, their application would occasionally hang. Debugging the hang shows that everybody was waiting for the Lpdr­Loader­Lock critical section. The catch was that the critical section was reported as locked, but the owner thread did not exist.

    0:000> !critsec ntdll!LdrpLoaderLock
    
    CritSec ntdll!LdrpLoaderLock+0 at 7758c0a0
    WaiterWoken        No
    LockCount          4
    RecursionCount     2
    OwningThread       150c
    EntryCount         0
    ContentionCount    4
    *** Locked
    
    0:000> ~~[150c]k
                  ^ Illegal thread error in '~~[150c]k'
    

    How can a critical section be owned by thread that no longer exists?

    There are two ways this can happen. One is that there is a bug in the code that manages the critical section such that there is some code path that takes the critical section but forgets to release it. This is unlikely to be the case for the loader lock (since a lot of really smart people are in charge of the loader lock), but it's a theoretical possibility. We'll keep that one in our pocket.

    Another possibility is that the code to exit the critical section never got a chance to run. For example, somebody may have thrown an exception across the stack frames which manage the critical section, or somebody may have tried to exit the thread from inside the critical section, or (gasp) somebody may have called Terminate­Thread to nuke the thread from orbit.

    I suggested that the Terminate­Thread theory was a good one to start with, because even if it wasn't the case, the investigation should go quickly because the light is better. You're not so much interested in finding it as you are in ruling it out quickly.¹

    The customer replied, "We had one call to Terminate­Thread in our application, and we removed it, but the problem still occurs. Is it worth also checking the source code of the DLLs our application links to?"

    Okay, the fact that they found one at all means that there's a good chance others are lurking.

    Before we could say, "Yes, please continue your search," the customer followed up. "We found a call to Terminate­Thread in a component provided by another company. The code created a worker thread, and decided to clean up the worker thread by terminating it. We commented out the call just as a test, and it seems to fix the problem. So at least now we know where the problem is and we can try to fix it properly."

    ¹ In the medical profession, there's the term ROMI, which stands for rule out myocardial infarction. It says that if a patient comes to you with anything that could possibly remotely be the symptom of a heart attack, like numbness in the arm or chest pain, you should perform a test to make sure. Even though the test is probably going to come back negative, you have to check just to be safe. Here, we're ruling out Terminate­Thread, which I guess could go by the dorky acronym ROTT.

  • The Old New Thing

    The latest technologies in plaintext editing: NotepadConf

    • 13 Comments

    On November 13, 2014 November 14, 2014, Saint Paul, Minnesota will be the home to NotepadConf, which bills itself as "the premier technology conference for Notepad.exe users and text enthusiasts."

    I'm still not sure whom Microsoft will send to the conference, but maybe that person could give a talk on how you can use Notepad to take down the entire Internet.

    Update: The conference has been rescheduled to Friday the 14th.

  • The Old New Thing

    Since clean-up functions can't fail, you have to soldier on

    • 20 Comments

    Clean-up functions can't fail, so what do you do if you encounter a failure in your clean-up function?

    You have to keep cleaning up.

    Some people like to follow this pattern for error checking:

    HRESULT Function()
    {
      hr = SomeFunction();
      if (FAILED(hr))) goto Exit;
    
      hr = AnotherFunction();
      if (FAILED(hr))) goto Exit;
    
      ... and so on until ...
    
      hr = S_OK;
    
    Exit:
        return hr;
    }
    

    And some like to put it inside a cute flow control macro like

    #define CHECK_HRESULT(hr) if (FAILED(hr)) goto Exit;
    

    or even

    #define CHECK_HRESULT(f) if (FAILED(hr = (f))) goto Exit;
    

    Whatever floats your boat.

    But you have to be careful if using this pattern in a clean-up function, because you might end up not actually cleaning up. For example:

    HRESULT Widget::Close()
    {
        HRESULT hr;
        CHECK_HRESULT(DisconnectDoodad(m_hDoodad));
        m_hDoodad = nullptr;
        for (int i = 0; i < ARRRAYSIZE(GadgetArray); i++) {
            CHECK_HRESULT(DestroyGadget(m_rghGadget[i]));
            m_rghGadget[i] = nullptr;
        }
    
        hr = S_OK;
    
    Exit:
        return hr;
    }
    

    What if there is an error disconnecting the doodad? (Maybe you got RPC_E_SERVER_DIED because the doodad lives on a remote server which crashed.) The cleanup code treats this as an error and skips destroying the gadget. But what can the caller do about this? Nothing, that's what. Eventually you get a bug that says, "On an unreliable network, we leak gadgets like crazy."

    Or worse, what if you're doing this in your destructor. You have nowhere to report the error. The caller simply expects that when the object is destroyed, all its resources are released.

    So release as much as you can. If something goes wrong with one of them, keep going, because there's still other stuff to clean up.

    Related: Never throw an exception from a destructor.

    Bonus chatter: Yes, I know that you can avoid this problem by wrapping the Doodad and Gadget handles inside a class which disconnects/destroys on destruction. That's not my point.

  • The Old New Thing

    Why does Explorer say "File too large" for my custom file system, when the problem has nothing to do with the file being too large (heck it's not even a file)

    • 34 Comments

    When Explorer copies files around, it doesn't really know what the maximum file size supported by any file system happens to be. (That information is not reported by Get­Volume­Information.) So it guesses.

    If the file system name is "FAT" or "FAT32", then Explorer assumes that the maximum file size is 4GB − 1.

    Also, if a file operation fails with the error ERROR_INVALID_PARAMETER, and Explorer can't figure out why the parameter is invalid, it assumes that the reason is that the file has exceeded the maximum allowed file size.

    Why does Explorer map "invalid parameter" to "file size too large"? Because some file systems use ERROR_INVALID_PARAMETER to report that a file is too large instead of the somewhat more obvious ERROR_FILE_TOO_LARGE.

    Therefore, if you're implementing a file system, and you're getting these spurious "File too large" errors, one thing to check is whether you are reporting "invalid parameter" for a case where all the parameters are actually valid, but something else prevents you from doing what you want. (Maybe "access denied" would be a better error code.)

  • The Old New Thing

    Microspeak: 1 – 1 is not zero

    • 28 Comments

    In his reddit AMA, Joe Belfiore wrote

    i have regular 1-1 meetings with my counterparts in Office, Skype, Xbox.

    The little bit of jargon there is 1-1 meeting. This is an abbreviation for one-on-one meeting, a common business practice wherein two people, typically a manager and a direct report, have a face-to-face meeting with no one else present. In the case Joe used, the meeting is not between a manager and a direct report but between two peers.

    The term is also abbreviated 1:1, which like 1 − 1 also looks like a bit of mathematical nonsense. But it's not zero or one. It's just an abbreviation for business jargon.

    Of course, Microspeak isn't happy with the abbreviation as-is. Microspeak takes it further and nounifies the abbreviation. "I'll bring that up in my 1–1 with Bob tomorrow" means "I'll bring that up in my one-on-one meeting with Bob tomorrow."

    Note that the abbreviation OOO is not used to mean one-on-one. That abbreviation is used to mean Out of the office, although it is more commonly abbreviated OOF at Microsoft.

  • The Old New Thing

    Enumerating the ways of choosing teams in a group of players

    • 16 Comments

    Suppose you have a bunch of people, and you want to break them up into m teams of size n. (Therefore you have a total of nm people.) Today's Little Program will enumerate the ways this can be done.

    Formally, let's say that you have a collection of size nm, and you want to enumerate the ways of partitioning the collection into m subsets, each subset of size n. The order of elements within each subset does not matter, and the order of the subsets doesn't matter. That's saying that a team of Alice and Bob is the same as a team of Bob and Alice, and Alice-Bob versus Charlie-David is the same as Charlie-David versus Alice-Bob.

    The number of ways of doing this is (nm)!/n!mm!. You can see this by first taking all permutations of the players, then dividing out by the things that cause us to overcount: The number of ways of ordering players within each team is n!, and there are m teams, and there are m! ways of ordering the teams themselves. (Note that this is a cute way of expressing the result, but you shouldn't use it for computation. A slightly better way for computation would be (Π1 ≤ knC(mk, m))/m!.

    Okay, but how do you generate the teams themeselves?

    Let's first see how to generate the first team. Well, that's easy. You just select n players and call them Team 1.

    This leaves you n(m − 1) players with which to form m − 1 teams, which you can do recursively.

    function Teams(n, m, f) {
     var a = [];
     for (var i = 1; i <= n * m; i++) {
      a.push(i);
     }
    
     if (m == 1) { f([a]); return; }
    
     Subsets(n * m, n, function(s) {
      var rest = a.filter(function(i) { return s.indexOf(i) < 0; });
      Teams(n, m - 1, function(t) {
        f([s].concat(t.map(function(team) {
            return team.map(function(i) { return rest[i-1]; });
           })));
      });
     });
    }
    
    Teams(2, 3, logToConsole);
    

    The first part of this function builds an array of the form [1, 2, 3, ..., n * m]. If we are asking for only one team, then everybody is on the same team. Otherwise, for all possible choices of n-member teams, first see which people haven't yet been picked for a team. Then generate all remaining possible team arrangements for those leftovers, and combine them to form the final team rosters.

    The combination step is tricky because the recursive call generates subsets in the range [1, 2, 3, ..., n * (m-1)], and we need to convert those values into indices into the array of people waiting to be picked.

    Note that this algorithm over-counts the possibilities since it generates both [[1,2],[3,4]] and [[3,4],[1,2]]. In other words, it assumes that team order is important (say, because the first team will wear red jerseys and the second team will wear blue jerseys). In the original problem statement, the order of the teams is not significant. (Maybe we'll let them pick their own jersey colors.)

    To solve that, we impose a way of choosing one such arrangement as the one we enumerate, and ignore the rest. The natural way to do this is to select a representative player from each team in a predictable manner (say, the one whose name comes first alphabetically), and then arranging the representatives in a predictable manner (say, by sorting them alphabetically).

    The revised version of our algorithm goes like this:

    function Teams(n, m, f) {
     var a = [];
     for (var i = 1; i <= n * m; i++) {
      a.push(i);
     }
    
     if (m == 1) { f([a]); return; }
    
     a.shift();
     Subsets(n * m - 1, n - 1, function(s) {
      var firstTeam = [1].concat(s.map(function(i) { return i+1; }))
      var rest = a.filter(function(i) { return s.indexOf(i) < 0; });
      Teams(n, m - 1, function(t) {
        f([firstTeam].concat(t.map(function(team) {
          return team.map(function(i) { return rest[i-1]; });
         })));
      });
     });
    }
    
    Teams(2, 3, logToConsole);
    

    The first part of the function is the same as before, but the recursive step changes.

    We remove the first element from the array. That guy needs to belong to some team, and since he's the smallest-numbered guy, he will be nominated as the team representative of whatever team he ends up with, and since he's the smallest-numbered guy of all, he will also be the first team representative when they are placed in sorted order. So we pick him right up front.

    We then ask for his n - 1 teammates, and together they make up the first team. The combination is a little tricky because the Subsets function assumes that the underlying set is [1, 2, ..., n-1] but we actually want the subset to be of the form [2, 3, ..., n]; we fix that by adding 1 to each element of the subset.

    We then find all the people who have yet to be assigned to a team and recursively ask for m - 1 more teams to be generated from them. We then combine the first team with the recursively-generated teams. Again, since the recursively-generated teams are numbered starting from 1, we need to convert the returned subsets into the original values we saved away in the rest variable.

    Renumbering elements is turning into a bit of a bother, so let's tweak our original Subsets function. For example, we would prefer to pass the set explicitly rather than letting Subsets assume that the set is [1, 2, 3, ..., n], forcing us to convert the indices back to the original set members. It's also convenient if the callback also included the elements that are not in the subset.

    function NamedSubsets(a, k, f) {
     if (k == 0) { f([], a); return; }
     if (a.length == 0) { return; }
     var n = a[a.length - 1];
     var rest = a.slice(0, -1);
     NamedSubsets(rest, k, function(chosen, rejected) {
      f(chosen, rejected.concat(n));
     });
     NamedSubsets(rest, k-1, function(chosen, rejected) {
      f(chosen.concat(n), rejected);
     });
    }
    
    function takeAndLeave(chosen,rejected) {
     console.log("take " + chosen + ", leave " + rejected);
    }
    
    NamedSubsets(["alice", "bob", "charlie"], 2, takeAndLeave);
    

    The Named­Subsets function takes the last element from the source set and either rejects it (adds it to the "rejected" parameter) or accepts it (adds it to the "chosen" parameter).

    With the Named­Subsets variant, we can write the Teams function much more easily.

    function Teams(a, m, f) {
     var n = a.length / m;
     if (m == 1) { f([a]); return; }
    
     var p = a[0];
     NamedSubsets(a.slice(1), n - 1, function(teammates, rest) {
      var team = [p].concat(teammates);
      Teams(rest, m - 1, function(teams) {
        f([team].concat(teams));
      });
     });
    }
    
    Teams([1,2,3,4,5,6], 3, logToConsole);
    

    Assuming we're not in one of the base cases, we grab the first person p so he can be captain of the first team. We then ask Named­Subsets to generate his teammates and add them to p's team. We then recursively generate all the other teams from the people who haven't yet been picked, and our result is our first team plus the recursively-generated teams.

    There is a lot of potential for style points with the Named­Subsets function. For example, we can avoid generating temporary copies of the a array just to remove an element by instead passing slices (an array and indices marking the start and end of the elements we care about).

    function NamedSubsetsSlice(a, begin, end, k, f) {
     if (k == 0) { f([], a.slice(begin, end)); return; }
     if (begin == end) { return; }
     var n = a[end - 1];
     NamedSubsetsSlice(a, begin, end - 1, k, function(chosen, rejected) {
      f(chosen, rejected.concat(n));
     });
     NamedSubsetsSlice(a, begin, end - 1, k-1, function(chosen, rejected) {
      f(chosen.concat(n), rejected);
     });
    }
    
    function NamedSubsets(a, k, f) {
     NamedSubsetsSlice(a, 0, a.length, k, f);
    }
    

    We could use an accumulator to avoid having to generate closures.

    function AccumulateNamedSubsets(a, begin, end, k, f, chosen, rejected) {
     if (k == 0) { f(chosen, rejected.concat(a.slice(begin, end))); return; }
     if (begin == end) { return; }
     var n = a[begin];
     AccumulateNamedSubsets(a, begin + 1, end, k-1, f, chosen.concat(n), rejected);
     AccumulateNamedSubsets(a, begin + 1, end, k, f, chosen, rejected.concat(n));
    }
    
    function NamedSubsetsSlice(a, begin, end, k, f) {
     AccumulateNamedSubsets(a, begin, end, k, f, [], []);
    }
    
    function NamedSubsets(a, k, f) {
     NamedSubsetsSlice(a, 0, a.length, k, f);
    }
    

    For bonus style points, I recurse on the start of the range rather than the beginning so that the results are in a prettier order.

    We can also get rid of the temporary accumulator objects by manipulating the accumulators destructively.

    function AccumulateNamedSubsets(a, begin, end, k, f, chosen, rejected) {
     if (k == 0) { f(chosen, rejected.concat(a.slice(begin, end))); return; }
     if (begin == end) { return; }
     var n = a[begin];
     chosen.push(n);
     AccumulateNamedSubsets(a, begin + 1, end, k-1, f, chosen, rejected);
     chosen.pop();
     rejected.push(n);
     AccumulateNamedSubsets(a, begin + 1, end, k, f, chosen, rejected);
     rejected.pop();
    }
    

    And then we can take advantage of the accumlator version to pre-select the first player when building teams.

    function Teams(a, m, f) {
     var n = a.length / m;
     if (m == 1) { f([a]); return; }
    
     AccumulateNamedSubsetsSlice(a, 1, a.length, n - 1, function(team, rest) {
      Teams(rest, m - 1, function(teams) {
        f([team].concat(teams));
      });
     }, [a[0]], []);
    }
    

    There is still a lot of potential for improvement here. For example, you can switch to the iterative version of Subsets to avoid the recursion on subset generation. You can use an accumulator in Teams to avoid generating closures. And if you are really clever, you can eliminate many more temporary arrays by reusing the elements in the various recursively-generated arrays by shuffling them around. But I've sort of lost interest in the puzzle by now, so I won't bother.

  • The Old New Thing

    Before claiming that a function doesn't work, you should check what you're passing to it and what it returns

    • 53 Comments

    Before claiming that a function doesn't work, you should check what you're passing to it and what it returns, because it may be that the function is behaving just fine and the problem is elsewhere.

    The Get­Current­DirectoryW function does not appear to support directories with Unicode characters in their names.

    wchar_t currentDirectory[MAX_PATH];
    GetCurrentDirectoryW(MAX_PATH, currentDirectory);
    wcout << currentDirectory << endl;
    

    The correct directory name is obtained if it contains only ASCII characters in its name, but it truncates the string at the first non-ASCII character.

    If you step through the code in the debugger, you'll see that the Get­Current­DirectoryW function is working just fine. The buffer is filled with the current directory, including the non-ASCII characters. The problem is that the wcout stream stops printing the directory name at the first non-ASCII characters. And that's because the default locale for wcout is the "C" locale, and the "C" locale is "the minimal environment for C translation." The "C" locale is useless for actual work involving, you know, locales. You will have to do some language-specific munging to get the characters to reach the screen in the format you want, the details of which are not the point of today's topic.

    In other words, the bug was not in the Get­Current­DirectoryW function. It was in what you did with the result of the Get­Current­DirectoryW function.

    Here's another example of thinking the problem is in a function when it isn't:

    The Set­Window­TextW function does not appear to support Unicode, despite its name.

    wstring line;
    wifstream file("test"); // this file is in Unicode
    getline(file, line);
    SetWindowTextW(hwnd, line.c_str());
    

    If you look at the line variable before you even get around to calling Set­Window­TextW, you'll see that it does not contain the text from your Unicode file. The problem is that the default wifstream reads the text as an 8-bit file, and then internally converts it (according to the lame "C" locale) to Unicode. If the original file is already Unicode, you're doing a double conversion and things don't go well. You then pass this incorrectly-converted string to Set­Window­TextW, which naturally displays something different from what you intended.

    Again, the point is not to delve into the intricacies of wifstream. The point is that the problem occurred even before you called Set­Window­TextW. The observed behavior, then, is simple a case of Garbage In, Garbage Out.

    Here's another example from a few years ago.

  • The Old New Thing

    What is the strange garbage-looking string in the "command" value of a static verb?

    • 14 Comments

    A customer from a major software vendor asked, "What is the significance of the command value that can be found under HKCR\⟨progid⟩\shell\open\command. It appears to be a copy of the default value, but with the program name replaced with apparent garbage. We've seen this both with Microsoft products as well as products by other companies. There is no mention of this value in the documentation on static verbs."

    Name Type Data
    (Default) REG_SZ "C:\Program Files\Contoso\CONTOSO.exe" /NOLOGO "%1"
    command REG_MULTI_SZ 34GY`{XL?{Y)2S($,PP>c=@0l{Ja0N8KUwy@4JdO /NOLOGO "%1"

    The customer didn't explain why they were interested in this particular registry value. Maybe they thought it was enabling some super magical powers, and they wanted to get in on that action. (If that was the case, then they failed to notice that the same command value also existed in the verb registration for their own program!)

    That strange garbage-looking string was placed there by Windows Installer (also known as MSI). It is the so-called Darwin descriptor that Windows Installer uses to figure out what program to run when the verb is invoked by the shell. For compatibility with programs that read the registry directly (because everybody knows that reading the registry is much cooler than using the API), the default value is set to something approximating the local executable's path. That default value might be incorrect if the application has moved in the meantime, and it might be missing entirely if the application is marked as install-on-demand and has never been used, but at least it keeps those rogue programs working 99% of the time.

  • The Old New Thing

    If you want to be notified when your app is uninstalled, you can do that from your uninstaller

    • 28 Comments

    A customer had a rather strange request. "Is there a way to be notified when the user uninstalls any program from Programs and Features (formerly known as Add and Remove Programs)?"

    They didn't explain what they wanted to do this for, and we immediately got suspicious. It sounds like the customer is trying to do something user-hostile, like seeing that a user uninstalled a program and immediately reinstalling it. (Sort of the reverse of force-uninstalling all your competitors.)

    The customer failed to take into account that there are many ways of uninstalling an application that do not involve navigating to the Programs and Features control panel. Therefore, any solution that monitors the activities of Programs and Features may not actually solve the customer's problem.

    The customer liaison went back to the customer to get more information about their problem scenario, and the response was, that the customer is developing something like an App Lending Library. The user goes to the Lending Library and installs an application. They want a way to figure out when the user uninstalls the application so that the software can be "checked back in" to the library (available for somebody else to use).

    The customer was asking for a question far harder than what they needed. They didn't need to be notified if the user uninstalled any application from the Programs and Features control panel. They merely needed to be notified if the user uninstalled one of their own applications from the Programs and Features control panel.

    And that is much easier to solve.

    After all, when an application is installed, it registers a command line to execute when the user clicks the Uninstall button. You can set that command line to do anything you want. For example, you can set it to

    Uninstall­String = "C:\Program Files\Contoso Lending Library\CheckIn.exe" ⟨identification⟩

    where ⟨identification⟩ is something that the Check­In program can use to know what program is being uninstalled, so that it can launch the real uninstaller and update the central database.

Page 7 of 434 (4,335 items) «56789»