January, 2012

  • The Old New Thing

    Why does it take Task Manager longer to appear when you start it from the Ctrl+Alt+Del dialog?

    • 39 Comments

    Amit was curious why it takes longer for Task Manager to appear when you start it from the Ctrl+Alt+Del dialog compared to launching it from the taskbar.

    Well, you can see the reason right there on the screen: You're launching it the long way around.

    If you launch Task Manager from the taskbar, Explorer just launches taskmgr.exe via the usual CreateProcess mechanism, and Task Manager launches under the same credentials on the same desktop.

    On the other hand, when you use the secure attention sequence, the winlogon program receives the notification, switches to the secure desktop, and displays the Ctrl+Alt+Del dialog. When you select Task Manager from that dialog, it then has to launch taskmgr.exe, but it can't use the normal CreateProcess because it's on the wrong desktop and it's running under the wrong security context. (Because winlogon runs as SYSTEM, as Task Manager will tell you.)

    Clearly, in order to get Task Manager running on your desktop with your credentials, winlogon needs to change its security context, change desktops, and then launch taskmgr.exe. The desktop switch is probably the slowest part, since it involves the video driver, and video drivers are not known for their blazingly fast mode changes.
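
    For the curious, here is a rough sketch of the general pattern a SYSTEM process can use to start a program on the interactive user's desktop under that user's credentials. This is not winlogon's actual code: the WTSQueryUserToken call and the hard-coded desktop name are illustrative assumptions, since winlogon has its own, more direct ways of obtaining the user's token and desktop.

        #include <windows.h>
        #include <wtsapi32.h>
        #pragma comment(lib, "wtsapi32.lib")

        BOOL LaunchTaskmgrOnUserDesktop(DWORD sessionId)
        {
            HANDLE userToken = NULL;
            /* WTSQueryUserToken requires the caller to hold SeTcbPrivilege,
               i.e. to be running as SYSTEM -- which winlogon is. */
            if (!WTSQueryUserToken(sessionId, &userToken)) {
                return FALSE;
            }

            STARTUPINFOW si = { sizeof(si) };
            /* Target the interactive desktop rather than the secure desktop. */
            si.lpDesktop = (LPWSTR)L"winsta0\\default";

            WCHAR cmdLine[] = L"taskmgr.exe";
            PROCESS_INFORMATION pi;
            BOOL ok = CreateProcessAsUserW(userToken, NULL, cmdLine,
                                           NULL, NULL, FALSE, 0, NULL, NULL,
                                           &si, &pi);
            if (ok) {
                CloseHandle(pi.hThread);
                CloseHandle(pi.hProcess);
            }
            CloseHandle(userToken);
            return ok;
        }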

    It's like asking why an international package takes longer to deliver than a domestic one. Because it's starting from further away, and it also has to go through customs.

  • The Old New Thing

    When DLL_PROCESS_DETACH tells you that the process is exiting, your best bet is just to return without doing anything

    • 52 Comments

    When the DllMain function receives a reason code of DLL_PROCESS_DETACH, the increasingly-inaccurately-named lpReserved parameter is used to indicate whether the process is exiting.

    And if the process is exiting, then you should just return without doing anything.

    No, really.

    Don't worry about freeing memory; it will all go away when the process address space is destroyed. Don't worry about closing handles; handles are closed automatically when the process handle table is destroyed. Don't try to call into other DLLs, because those other DLLs may already have received their DLL_PROCESS_DETACH notifications, in which case they may behave erratically in the same way that a Delphi object behaves erratically if you try to use it after its destructor has run.

    The building is being demolished. Don't bother sweeping the floor and emptying the trash cans and erasing the whiteboards. And don't line up at the exit to the building so everybody can move their in/out magnet to out. All you're doing is making the demolition team wait for you to finish these pointless housecleaning tasks.

    Okay, if you have internal file buffers, you can write them out to the file handle. That's like remembering to take the last pieces of mail from the mailroom out to the mailbox. But don't bother closing the handle or freeing the buffer, in the same way you shouldn't bother updating the "mail last picked up on" sign or resetting the flags on all the mailboxes. And ideally, you would have flushed those buffers as part of your normal wind-down before calling ExitProcess, in the same way that mailing those last few letters should have been taken care of before you called in the demolition team.
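
    Here's a minimal sketch of a DllMain that follows this advice. The FlushMyBuffers and CleanUpMyStuff helpers are hypothetical stand-ins for whatever your DLL actually needs to do.

        /* FlushMyBuffers and CleanUpMyStuff are hypothetical placeholders. */
        BOOL WINAPI DllMain(HINSTANCE hinstDll, DWORD fdwReason, LPVOID lpvReserved)
        {
            switch (fdwReason) {
            case DLL_PROCESS_DETACH:
                if (lpvReserved != NULL) {
                    /* Process is exiting: optionally flush pending data, then get out.
                       No freeing, no closing handles, no calls into other DLLs. */
                    FlushMyBuffers();
                    return TRUE;
                }
                /* DLL is being unloaded via FreeLibrary while the process keeps
                   running: full cleanup is appropriate here. */
                CleanUpMyStuff();
                break;
            }
            return TRUE;
        }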

    I regularly use a program that doesn't follow this rule. The program allocates a lot of memory during the course of its life, and when I exit the program, it just sits there for several minutes, sometimes spinning at 100% CPU, sometimes churning the hard drive (sometimes both). When I break in with the debugger to see what's going on, I discover that the program isn't doing anything productive. It's just methodically freeing every last byte of memory it had allocated during its lifetime.

    If my computer isn't under a lot of memory pressure, then most of the memory the program allocated during its lifetime hasn't been paged out yet, so freeing every last drop of memory is a CPU-bound operation. On the other hand, if I had kicked off a build or done something else memory-intensive, then most of the memory the program allocated during its lifetime has been paged out, which means that the program has to page all that memory back in from the hard drive, just so it can call free on it. Sounds kind of spiteful, actually. "Come here so I can tell you to go away."

    All this anal-retentive memory management is pointless. The process is exiting. All that memory will be freed when the address space is destroyed. Stop wasting time and just exit already.

  • The Old New Thing

    Why wasn't the Windows 95 shell prototyped on Windows NT?

    • 21 Comments

    Carlos wonders why the Windows 95 shell was prototyped as 16-bit code running on the still-under-development 32-bit kernel, USER, and GDI as opposed to being prototyped as fully 32-bit code on Windows NT.

    There were a number of reasons, some good, some bad.

    One reason was that the Windows 95 shell was being developed by the Windows 95 team, which was an outgrowth of the Windows 3.1 team. That meant that they had Windows 3.1-class hardware. And the hardware requirements of Windows NT were significantly higher than the hardware requirements of Windows 3.1. Here's a table for comparison:

    Platform          RAM (minimum)   RAM (recommended)
    Windows 3.1       2MB             4MB
    Windows NT 3.1    12MB            16MB
    Windows 95        4MB             8MB

    The Windows 3.1 team adhered to the principle that the team members' machines, as a general rule, were as powerful as the recommended hardware requirements. If you asked really nicely, you were permitted to exceed that, but not by too much, with one notable exception. Think of it as performance dogfood. If Windows was unusable on the stated recommended hardware requirements, the entire team felt it because that's what they were running. (When Windows 95 shipped, my primary machine was a 486/DX50 with 8MB of RAM. My test machine was a 386 with 4MB of RAM. The combined computing power and storage capacity of all the machines in my office is now exceeded by your cell phone.)

    Okay, so you just finished Windows 3.1, and all of the team members currently have 4MB machines, with a few lucky people that have a whopping 8MB of RAM. If you decided to do your prototype work on Windows NT, that would mean tripling the amount of memory in most of the computers just to meet the minimum requirements for Windows NT. And you can't say that "Well, you would have had to pay for all that RAM anyway," because look at that chart: Windows 95's final hardware requirements were still lower than Windows NT's minimum!

    Spending all that money to upgrade the computers for something that was just a temporary situation anyway seemed like a bad way of spending your hardware budget.

    From the software development side, prototyping the new shell on Windows NT was not practical because Windows 95 introduced a whole bunch of new features to Win32, features which didn't exist in Windows NT. Part of the goal of the prototype was to exercise these new features, things we take for granted nowadays like RegisterClassEx and WM_MOVING and the Close button in the upper right corner. Those features didn't exist in Windows NT; if you wanted to develop the prototype on Windows NT, somebody would have had to port them and build a special "throwaway" version of Windows NT for the Windows 95 team to use. That means devoting some people to learning a whole new code base and development environment (and buying lots more hardware) to add features that they had no intention of shipping.

    It was much more cost-effective to do the prototyping of the Windows 95 shell on Windows 95. You could see if a design led to poor performance and deal with it before things went too far in the wrong direction. You could use those fancy new functions in kernel, USER, and GDI, which is great because that meant that you would start finding bugs in those fancy new functions, you would start finding design flaws in those fancy new functions. If the shell team needed a new feature from the window manager or the kernel, they could just ask for it, and then they could start using it and file bugs when it didn't work the way they wanted. All the effort was for real. Nothing was being thrown away except for the stuff inside #ifdef WIN16 blocks, which was kept to a minimum.

    All through the shell prototyping effort, the code was compiled both with and without #define WIN16, and as the kernel team worked on supporting 32-bit processes, they had this program sitting there waiting to go that they could try out. And not some dorky Hello world program but a real program that does interesting things. (They couldn't use any of the Windows NT built-in programs because those were Unicode-based, and Windows 95 didn't support Unicode.)
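
    As an illustration of the kind of thing that lived inside those #ifdef WIN16 blocks, here is a made-up fragment showing how a single macro could paper over a 16-bit/32-bit API difference so the rest of the source compiled either way. The specific macro is my own example, not something taken from the actual shell sources.

        /* Hypothetical example of a WIN16 compatibility block; the real shell
           sources had their own (much larger) arrangements. */
        #ifdef WIN16
        /* 16-bit build: per-window data is accessed as 16-bit "window words". */
        #define GetWindowInstance(hwnd) ((HINSTANCE)GetWindowWord(hwnd, GWW_HINSTANCE))
        #else
        /* 32-bit build: the same data lives in a 32-bit "window long". */
        #define GetWindowInstance(hwnd) ((HINSTANCE)GetWindowLong(hwnd, GWL_HINSTANCE))
        #endif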

    Maybe those were bad reasons, but that was the thinking.

  • The Old New Thing

    Why was there a font just for drawing symbols on buttons?

    • 19 Comments

    Henke37 wonders why the Marlett font was introduced. Why use a font for drawing symbols on window buttons?

    Using a font was a convenient way to have scalable graphics.

    It's not like Windows could've used VML or SVG since they hadn't been invented yet. EMFs would have been overkill as well. Fonts were very convenient because the technology to render scalable fonts already existed and was well-established. It's always good to build on something that has been proven, and TrueType scalable font technology proved itself very nicely in Windows 3.1. TrueType has the added benefit of supporting hinting, allowing tweaks to the glyph outlines to be made for particular pixel sizes. (A feature not available in most vector drawing languages, but also a feature very important when rendering at small font sizes.)
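
    As a sketch of how such a font gets used, the fragment below selects Marlett at the size of the target rectangle and draws one glyph. The choice of the 'r' character for the close-box X is an assumption from memory rather than documented fact, so treat the mapping as illustrative.

        #include <windows.h>

        /* Draw a caption-style glyph scaled to fill prc. Because Marlett is a
           TrueType font, the glyph scales (with hinting) to any requested height. */
        void DrawCaptionGlyph(HDC hdc, RECT *prc)
        {
            HFONT hf = CreateFontW(prc->bottom - prc->top, 0, 0, 0, FW_NORMAL,
                                   FALSE, FALSE, FALSE, DEFAULT_CHARSET,
                                   OUT_DEFAULT_PRECIS, CLIP_DEFAULT_PRECIS,
                                   DEFAULT_QUALITY, DEFAULT_PITCH, L"Marlett");
            HFONT hfOld = (HFONT)SelectObject(hdc, hf);
            /* 'r' is assumed here to be the close-box X; verify against the font. */
            DrawTextW(hdc, L"r", 1, prc, DT_CENTER | DT_VCENTER | DT_SINGLELINE);
            SelectObject(hdc, hfOld);
            DeleteObject(hf);
        }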

  • The Old New Thing

    Don't try to allocate memory until there is only x% free

    • 52 Comments

    I have an ongoing conflict with my in-laws. Their concept of the correct amount of food to have in the refrigerator is "more than will comfortably fit." Whenever they come to visit (which is quite often), they make sure to bring enough food so that my refrigerator bursts at the seams, with vegetables and eggs and other foodstuffs crammed into every available nook and cranny. If I'm lucky, the amount of food manages to get down to "only slightly overfull" before their next visit. And the problem isn't restricted to the refrigerator. I once cleared out some space in the garage, only to find that they decided to use that space to store more food. (Who knows, maybe one day I will return from an errand to find that my parking space has been filled with still more food while I was gone.)

    Occasionally, a customer will ask for a way to design their program so it continues consuming RAM until there is only x% free. The idea is that their program should use RAM aggressively, while still leaving enough RAM available (x%) for other use. Unless you are designing a system where you are the only program running on the computer, this is a bad idea.
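
    To make the discussion concrete, this is roughly the check such a program would have to keep making (query how much physical memory is currently free). It's shown only to illustrate the mechanism; the result is a snapshot that is stale the moment it returns, which is part of why the policy goes wrong in the tug-of-war below.

        #include <windows.h>

        /* Sketch: returns TRUE if more than `percent` of physical RAM is free.
           Other programs are changing this number constantly, which is exactly
           how two "good programs" end up fighting each other. */
        BOOL IsMoreThanPercentFree(DWORD percent)
        {
            MEMORYSTATUSEX ms = { sizeof(ms) };
            if (!GlobalMemoryStatusEx(&ms)) return FALSE;
            return (ms.ullAvailPhys * 100 / ms.ullTotalPhys) > percent;
        }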

    Consider what happens if two programs try to be "good programs" and leave x% of RAM available for other purposes. Let's call the programs Program 10 (which wants to keep 10% of the RAM free) and Program 20 (which wants to keep 20% of the RAM free). For simplicity, let's suppose that they are the only two programs on the system.

    Initially, the computer is not under memory pressure, so both programs can allocate all the memory they want without any hassle. But as time passes, the amount of free memory slowly decreases.

    Program 10 (20%) Free (60%) Program 20 (20%)
    Program 10 (30%) Free (40%) Program 20 (30%)
    Program 10 (40%) Free (20%) Program 20 (40%)

    And then we hit a critical point: The amount of free memory drops below 20%.

    Program 10 (41%) Free (18%) Program 20 (41%)

    At this point, Program 20 backs off in order to restore the amount of free memory back to 20%.

    Program 10 (41%) Free (20%) Program 20 (39%)

    Now, each time Program 10 and Program 20 think about allocating more memory, Program 20 will say "Nope, I can't do that because it would send the amount of free memory below 20%." On the other hand, Program 10 will happily allocate some more memory since it sees that there's a whole 10% it can allocate before it needs to stop. And as soon as Program 10 allocates that memory, Program 20 will free some memory to bring the amount of free memory back up to 20%.

    Program 10 (42%) Free (19%) Program 20 (39%)
    Program 10 (42%) Free (20%) Program 20 (38%)
    Program 10 (43%) Free (19%) Program 20 (38%)
    Program 10 (43%) Free (20%) Program 20 (37%)
    Program 10 (44%) Free (19%) Program 20 (37%)
    Program 10 (44%) Free (20%) Program 20 (36%)

    I think you see where this is going. Each time Program 10 allocates a little more memory, Program 20 frees the same amount of memory in order to get the total free memory back up to 20%. Eventually, we reach a situation like this:

    Program 10 (75%) Free (20%) P20 (5%)

    Program 20 is now curled up in the corner of the computer in a fetal position. Program 10 meanwhile continues allocating memory, and Program 20, having shrunk as much as it can, is forced to just sit there and whimper.

    Program 10 (76%) Free (19%) P20 (5%)
    Program 10 (77%) Free (18%) P20 (5%)
    Program 10 (78%) Free (17%) P20 (5%)
    Program 10 (79%) Free (16%) P20 (5%)
    Program 10 (80%) Free (15%) P20 (5%)
    Program 10 (81%) Free (14%) P20 (5%)
    Program 10 (82%) Free (13%) P20 (5%)
    Program 10 (83%) Free (12%) P20 (5%)
    Program 10 (84%) Free (11%) P20 (5%)
    Program 10 (85%) Free (10%) P20 (5%)

    Finally, Program 10 stops allocating memory since it has reached its own personal limit of not allocating the last 10% of the computer's RAM. But it's too little too late. Program 20 has already been forced into the corner, thrashing its brains out trying to survive on only 5% of the computer's memory.

    It's sort of like when people from two different cultures with different concepts of personal space have a face-to-face conversation. The person from the not-so-close culture will try to back away in order to preserve the necessary distance, while the person from the closer-is-better culture will move forward in order to close the gap. Eventually, the person from the not-so-close culture will end up with his back against the wall anxiously looking for an escape route.

  • The Old New Thing

    How do FILE_FLAG_SEQUENTIAL_SCAN and FILE_FLAG_RANDOM_ACCESS affect how the operating system treats my file?

    • 39 Comments

    There are two flags you can pass to the CreateFile function to provide hints regarding your program's file access pattern. What happens if you pass either of them, or neither?

    Note that the following description is not contractual. It's just an explanation of the current heuristics (where "current" means "Windows 7"). These heuristics have changed from one version of Windows to the next, so consider this information a tip to help you choose an appropriate access pattern flag for your program, not a guarantee that the cache manager will behave in a specific way if you do a specific thing.

    If you pass the FILE_FLAG_SEQUENTIAL_SCAN flag, then the cache manager alters its behavior in two ways: First, the amount of prefetch is doubled compared to what it would have been if you hadn't passed the flag. Second, the cache manager marks as available for re-use those cache pages which lie entirely behind the current file pointer (assuming there are no other applications using the file). After all, by saying that you are accessing the file sequentially, you're promising that the file pointer will always move forward.

    At the opposite extreme is FILE_FLAG_RANDOM_ACCESS. In the random access case, the cache manager performs no prefetching, and it does not aggressively evict pages that lie behind the file pointer. Those pages (as well as the pages that lie ahead of the file pointer which you already read from or wrote to) will age out of the cache according to the usual most-recently-used policy, which means that heavy random reads against a file will not pollute the cache (the new pages will replace the old ones).
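
    For reference, the hint is simply one of the dwFlagsAndAttributes values passed to CreateFile; a minimal sketch:

        #include <windows.h>

        /* Open a file and tell the cache manager we intend to read it from
           start to finish. Swap in FILE_FLAG_RANDOM_ACCESS for the opposite
           hint, or pass neither flag to let the cache manager guess. */
        HANDLE OpenForSequentialScan(LPCWSTR path)
        {
            return CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                               OPEN_EXISTING, FILE_FLAG_SEQUENTIAL_SCAN, NULL);
        }

    Remember that the flag describes your access pattern; it is a hint to the cache manager, not a command.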

    In between is the case where you pass neither flag.

    If you pass neither flag, then the cache manager tries to detect your program's file access pattern. This is where things get weird.

    If you issue a read that begins where the previous read left off, then the cache manager performs some prefetching, but not as much as if you had passed FILE_FLAG_SEQUENTIAL_SCAN. If sequential access is detected, then pages behind the file pointer are also evicted from the cache. If you issue around six reads in a row, each of which begins where the previous one left off, then the cache manager switches to FILE_FLAG_SEQUENTIAL_SCAN behavior for your file, but once you issue a read that no longer begins where the previous read left off, the cache manager revokes your temporary FILE_FLAG_SEQUENTIAL_SCAN status.

    If your reads are not sequential, but they still follow a pattern where the file offset changes by the same amount between each operation (for example, you seek to position 100,000 and read some data, then seek to position 150,000 and read some data, then seek to position 200,000 and read some data), then the cache manager will use that pattern to predict the next read. In the above example, the cache manager will predict that your next read will begin at position 250,000. (This prediction works for decreasing offsets, too!) As with auto-detected sequential scans, the prediction stops as soon as you break the pattern.

    Since people like charts, here's a summary of the above in tabular form:

    Access pattern                  Prefetch    Evict-behind
    Explicit random                 No          No
    Explicit sequential             Yes (2×)    Yes
    Autodetected sequential         Yes         Yes
    Autodetected very sequential    Yes (2×)    Yes
    Autodetected linear             Yes         ?
    None                            No          ?

    There are some question marks in the above table where I'm not sure exactly what the answer is.

    Note: These cache hints apply only if you use ReadFile (or moral equivalents). Memory-mapped file access does not go through the cache manager, and consequently these cache hints have no effect.

  • The Old New Thing

    How can I detect the language a run of text is written in?

    • 25 Comments

    A customer asked, "I have a Unicode string. I want to know what language that string is in. Is there a function that can give me this information? I am most interested in knowing whether it is written in an East Asian language."

    The problem of determining the language in which a run of text is written is rather difficult. Many languages share the same script, or at least very similar scripts, so you can't just go based on which Unicode code point ranges appear in the string of text. (And what if the text contains words from multiple languages?) With heuristics and statistical analysis and a large enough sample, the confidence level increases, but reaching 100% confidence is difficult. I vaguely recall that there is a string of text which is a perfectly valid sentence in both Spanish and Portuguese, but with radically different meanings in the two languages!

    The customer was unconvinced of the difficulty of this problem. "Language detection of a single Unicode character should work with 100% accuracy. After all, the operating system already has a function to do this. When I pass the run of text to GDI, it knows to use a Chinese font to render the Chinese characters and a Korean font to render the Korean characters."

    The customer has fallen into the trap of confusing scripts with languages. The customer in this case is an East Asian company, so they have entered the linguistic world with a mindset that each language has its own unique script, since that is true for the languages in their part of the world.

    It's actually kind of interesting seeing a different set of linguistic assumptions. Whereas companies in the United States assume that every language is like English, it appears that companies in East Asia assume that every language is like English, Japanese, Chinese, Korean, or Thai. In this company's world, the letter "A" is clearly English, since it never occurred to them that it might be German, Swedish, or French.

    When GDI is asked to render a run of text, it looks for a font that can render each specific character, and once it finds such a font, it tries to keep using that font until it runs into a character which that font doesn't support, and then it begins a new search. You can see this effect when a non-Western character is inserted into a string when rendered on a system whose default code page is Western. GDI will switch to a font that supports the non-Western character, and it will keep using that font for the remainder of the string, even though the rest of the string uses just the letters A through Z. For example, the string might render like this: Dvořak. GDI switched to a different font to render the "ř" and remained in that font instead of returning to the original font for the "ak".

    Anyway, the answer to the customer's question of language detection is to use the language detection capability of the Extended Linguistic Services.

    If you are operating in the more constrained world of "I just want to know if it's Chinese/Japanese/Korean/Thai or isn't," then you could fall back to checking Unicode character ranges. If you see characters in the ranges dedicated to characters from those East Asian scripts, then you found text which is (at least partially) in one of those languages. Note, however, that this algorithm requires continual tweaking because the Unicode standard is a moving target. For example, the range of characters which can be used by East Asian languages expanded with the introduction of the Supplemental Ideographic Plane. You're probably best just letting somebody else worry about this, say, by asking GetStringTypeEx for CT_CTYPE3 information, or using GetStringScripts (or its redistributable doppelgänger DownlevelGetStringScripts) or simply by asking ELS to do everything.
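
    Here is a sketch of that constrained check, using GetStringTypeEx with CT_CTYPE3 to look for ideographs and kana. This only tells you about scripts, not languages, and Hangul and Thai would need additional tests (for example, Unicode range checks), so treat it as a starting point.

        #include <windows.h>
        #include <stdlib.h>

        /* Returns TRUE if any character in the string is classified as an
           ideograph, katakana, or hiragana. Script detection only; it says
           nothing about which language the text is actually written in. */
        BOOL ContainsIdeographOrKana(LPCWSTR text, int cch)
        {
            LPWORD types = (LPWORD)calloc(cch, sizeof(WORD));
            if (!types) return FALSE;

            BOOL found = FALSE;
            if (GetStringTypeExW(LOCALE_USER_DEFAULT, CT_CTYPE3, text, cch, types)) {
                for (int i = 0; i < cch; i++) {
                    if (types[i] & (C3_IDEOGRAPH | C3_KATAKANA | C3_HIRAGANA)) {
                        found = TRUE;
                        break;
                    }
                }
            }
            free(types);
            return found;
        }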

  • The Old New Thing

    Why do Microsoft customer records use the abbreviation "cx" for customer?

    • 40 Comments

    As is common in many industries, Microsoft customer service records employ abbreviations for many commonly-used words. In the travel industry, for example, pax is used as an abbreviation for passenger. The term appears to have spread to the hotel industry, even though people who stay at a hotel aren't technically passengers. (Well, unless you think that with the outrageous prices charged by the hotels, the people are being taken for a ride.)

    For a time, the standard abbreviation for customer in Microsoft's customer service records was cu. This changed, however, when it was pointed out to the people in charge of such things that cu is a swear word in Portuguese. The standard abbreviation was therefore changed to cx.

    If you're reading through old customer records and you know Portuguese and you see the word cu, please understand that we are not calling the customer a rude name.

    The person who introduced me to this abbreviation added, "I just spell out the word. It's not that much more work, and it's a lot easier to read."

    Some years ago, I was asked to review a technical book, and one of the items of feedback I returned was that the comments in the code fragments were full of mysterious abbreviations. "Sgnl evt before lv cs." I suggested that the words be spelled out or, if you really want to use abbreviations, at least have somewhere in the text where the abbreviations are explained.

    If I had wanted to demonstrate the social skills of a thermonuclear device, my feedback might have read "unls wrtg pzl bk, avd unxplnd n unnec abbvs."

  • The Old New Thing

    How do I disable the fault-tolerant heap?

    • 10 Comments

    A while back, I linked to a talk by Silviu Calinoiu on the fault-tolerant heap. But what if you don't want the fault-tolerant heap? For example, during program development, you probably want to disable the fault-tolerant heap for your program: If the program is crashing, then it should crash so you can debug it!

    Method 1 is to disable the fault-tolerant heap globally. While this prevents the fault-tolerant heap from auto-activating in the future, it does not go back and undo activations that were enabled in the past. In other words, you have to remember to do this before your application crashes for the first time.

    Therefore, you probably want to combine Method 1 with Method 2 on the same page, where it gives instructions on how to reset the list of applications for which the fault-tolerant heap is enabled.

    Mario Raccagni provides a third way of disabling the fault-tolerant heap, this time for one specific process instead of globally. His explanation is in Italian, so you get to exercise your translation skills.

    tl;dr version: Go to the HKEY_LOCAL_MACHINE and HKEY_CURRENT_USER versions of Software\Microsoft\Windows NT\CurrentVersion\AppCompatFlags\Layers\your_application.exe and delete the FaultTolerantHeap entry.
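
    If you want to script that, here is a heavily hedged sketch that follows the description above literally (a per-application key under Layers containing a FaultTolerantHeap value). The layout under Layers can differ (the setting is sometimes stored as a value named after the executable's full path instead), so inspect the key in regedit before deleting anything. The HKEY_CURRENT_USER half is shown; the HKEY_LOCAL_MACHINE half is analogous.

        #include <windows.h>

        /* Sketch only: deletes the FaultTolerantHeap value from the per-user
           Layers entry for exeName, following the article's description.
           Verify the actual registry layout on your system before using. */
        LONG RemoveFthLayerForCurrentUser(LPCWSTR exeName)
        {
            WCHAR subkey[512];
            wsprintfW(subkey,
                      L"Software\\Microsoft\\Windows NT\\CurrentVersion\\"
                      L"AppCompatFlags\\Layers\\%s", exeName);

            HKEY hkey;
            LONG rc = RegOpenKeyExW(HKEY_CURRENT_USER, subkey, 0, KEY_SET_VALUE, &hkey);
            if (rc == ERROR_SUCCESS) {
                rc = RegDeleteValueW(hkey, L"FaultTolerantHeap");
                RegCloseKey(hkey);
            }
            return rc;
        }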

  • The Old New Thing

    Keys duplicated from photo: Delayed reaction

    • 30 Comments

    There was a report some time ago that researchers have developed a way to duplicate keys given only a photograph. When I read this story, I was reminded of an incident that occurred to a colleague of mine.

    He accidentally locked his keys in his car and called a locksmith. Frustratingly, the keys were sitting right there on the driver's seat. The locksmith arrived and assessed the situation. "Well, since you already paid for me to come all the way out here, how would you like a spare key?"

    "Huh? What do you mean?"

    The locksmith looked at the key on the driver's seat, studied it intently for a few seconds, then returned to his truck. A short while later, he returned with a freshly-cut key, which he inserted into the door lock.

    The key worked.
