# January, 2012

• #### Don't try to allocate memory until there is only x% free

I have an ongoing conflict with my in-laws. Their concept of the correct amount of food to have in the refrigerator is "more than will comfortably fit." Whenever they come to visit (which is quite often), they make sure to bring enough food so that my refrigerator bursts at the seams, with vegetables and eggs and other foodstuffs crammed into every available nook and cranny. If I'm lucky, the amount of food manages to get down to "only slightly overfull" before their next visit. And the problem isn't restricted to the refrigerator. I once cleared out some space in the garage, only to find that they decided to use that space to store more food. (Who knows, maybe one day I will return from an errand to find that my parking space has been filled with still more food while I was gone.)

Occasionally, a customer will ask for a way to design their program so it continues consuming RAM until there is only x% free. The idea is that their program should use RAM aggressively, while still leaving enough RAM available (x%) for other use. Unless you are designing a system where you are the only program running on the computer, this is a bad idea.

Consider what happens if two programs try to be "good programs" and leave x% of RAM available for other purposes. Let's call the programs Program 10 (which wants to keep 10% of the RAM free) and Program 20 (which wants to keep 20% of the RAM free). For simplicity, let's suppose that they are the only two programs on the system.

Initially, the computer is not under memory pressure, so both programs can allocate all the memory they want without any hassle. But as time passes, the amount of free memory slowly decreases.

| Program 10 | Free | Program 20 |
|---|---|---|
| 20% | 60% | 20% |
| 30% | 40% | 30% |
| 40% | 20% | 40% |

And then we hit a critical point: The amount of free memory drops below 20%.

| Program 10 | Free | Program 20 |
|---|---|---|
| 41% | 18% | 41% |

At this point, Program 20 backs off in order to restore the amount of free memory back to 20%.

| Program 10 | Free | Program 20 |
|---|---|---|
| 41% | 20% | 39% |

Now, each time Program 10 and Program 20 think about allocating more memory, Program 20 will say "Nope, I can't do that because it would send the amount of free memory below 20%." On the other hand, Program 10 will happily allocate some more memory since it sees that there's a whole 10% it can allocate before it needs to stop. And as soon as Program 10 allocates that memory, Program 20 will free some memory to bring the amount of free memory back up to 20%.

| Program 10 | Free | Program 20 |
|---|---|---|
| 42% | 19% | 39% |
| 42% | 20% | 38% |
| 43% | 19% | 38% |
| 43% | 20% | 37% |
| 44% | 19% | 37% |
| 44% | 20% | 36% |

I think you see where this is going. Each time Program 10 allocates a little more memory, Program 20 frees the same amount of memory in order to get the total free memory back up to 20%. Eventually, we reach a situation like this:

| Program 10 | Free | Program 20 |
|---|---|---|
| 75% | 20% | 5% |

Program 20 is now curled up in the corner of the computer in a fetal position. Program 10 meanwhile continues allocating memory, and Program 20, having shrunk as much as it can, is forced to just sit there and whimper.

| Program 10 | Free | Program 20 |
|---|---|---|
| 76% | 19% | 5% |
| 77% | 18% | 5% |
| 78% | 17% | 5% |
| 79% | 16% | 5% |
| 80% | 15% | 5% |
| 81% | 14% | 5% |
| 82% | 13% | 5% |
| 83% | 12% | 5% |
| 84% | 11% | 5% |
| 85% | 10% | 5% |

Finally, Program 10 stops allocating memory since it has reached its own personal limit of not allocating the last 10% of the computer's RAM. But it's too little too late. Program 20 has already been forced into the corner, thrashing its brains out trying to survive on only 5% of the computer's memory.
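If you want to watch the squeeze happen, here is a minimal sketch of the tug-of-war in C. Everything in it is an assumption made for illustration: the `Program` structure, the `take_turn` and `simulate` names, the one-percent-per-turn allocation step, and the `min_used` floor below which a program cannot shrink. It also alternates turns strictly, so the intermediate states differ slightly from the walkthrough above, but it ends up in the same place.

```c
#include <assert.h>
#include <stdio.h>

/* All quantities are whole percentages of total RAM. */
typedef struct {
    const char *name;
    int used;      /* RAM currently held by the program */
    int reserve;   /* free RAM the program insists on leaving */
    int min_used;  /* smallest footprint it can shrink to (hypothetical) */
} Program;

/* One turn: give memory back if free RAM is below the reserve,
   otherwise grab one more percent if there is headroom above it. */
static void take_turn(Program *self, const Program *other)
{
    int free_now = 100 - self->used - other->used;
    if (free_now < self->reserve) {
        int shortfall = self->reserve - free_now;
        int spare = self->used - self->min_used;
        if (spare < 0)
            spare = 0;
        self->used -= (shortfall < spare) ? shortfall : spare;
    } else if (free_now > self->reserve) {
        self->used += 1;
    }
}

/* Alternates turns until neither program changes (or max_rounds is hit).
   Returns the percentage of RAM left free at the end. */
int simulate(Program *a, Program *b, int max_rounds, int trace)
{
    assert(a->used + b->used <= 100);
    for (int round = 0; round < max_rounds; round++) {
        int a_before = a->used;
        int b_before = b->used;
        take_turn(a, b);
        take_turn(b, a);
        if (a->used == a_before && b->used == b_before)
            break;  /* both programs are stuck where they are */
        if (trace)
            printf("%s (%d%%)  Free (%d%%)  %s (%d%%)\n",
                   a->name, a->used, 100 - a->used - b->used,
                   b->name, b->used);
    }
    return 100 - a->used - b->used;
}
```

Start both programs at 20% with reserves of 10 and 20 and a floor of 5, and the simulation settles with Program 10 holding 85%, Program 20 pinned at its 5% floor, and 10% free. The program with the more generous reserve always loses.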

It's sort of like when people from two different cultures with different concepts of personal space have a face-to-face conversation. The person from the not-so-close culture will try to back away in order to preserve the necessary distance, while the person from the closer-is-better culture will move forward in order to close the gap. Eventually, the person from the not-so-close culture will end up with his back against the wall anxiously looking for an escape route.

• #### Microspeak: Walls and ladders

"Walls and Ladders" is not a game. It's just a metaphor for a conflict in which one side wants to perform some action and the other side wants to prevent it. The defending side builds a wall, and the attacking side builds a taller ladder. In response, the defending side builds a taller wall, and the attacking side builds an even taller ladder. The result of this conflict is that the defending side constructs an ever-more-elaborate wall and the attacking side constructs a more-and-more complex ladder [link possible NSFW], both sides expending ridiculous amounts of resources and ultimately ending up back where they started.

There is a closely-related metaphor known as an arms race. In an arms race, each participant wants to be the most X, for some property X. An arms race tends to be all-attack, whereas wall-and-ladders tends to have one side attacking and the other defending.

Since many conflicts can be phrased either as an attack-attack scenario or an attack-defend scenario (some defenses may include counter-attacks), I tend to get the two confused. Notice, for example, that my arms race article contains mostly walls-and-ladders scenarios; for example, a case where one side wants to terminate a process and another wants to prevent it from being terminated. On the other hand, my wall and ladders example was really more of an arms race, with both sides wanting to take control of the screen.

Depending on which group you work with at Microsoft, you may find a preference for walls and ladders over arms race, probably due to the same sensitivity to military terms that led to the War Room being renamed Ship Room. (I seem to recall that there was a lawsuit that among other things alleged that the fact that a Microsoft project called its daily meeting room the War Room was proof of Microsoft's evil essence.)

• #### Cultural arbitrage: The food-related sucker bet

While I was at a group dinner at a Chinese restaurant, a whole fish was brought to our table. One of the other people at the table told a story of another time a whole fish was brought to the table.

He attended the wedding rehearsal dinner of a family member. The bride is Chinese, but the groom is not. (Or maybe it was the other way around. Doesn't matter to the story.) The dinner was banquet-style at a Chinese restaurant, and one of the many courses was a whole fish.

Two of the non-Chinese attendees marveled at the presence of an entire fish right there in front of them, head, tail, fins, and all. I guess they had up until then only been served fish that had already been filleted, or at least had the head cut off. One of them nudged my acquaintance and said, "We'll give you \$500 if you eat the eyeball."

These guys inadvertently created their own sucker bet.

For you see, eating the eyeball is common in many parts of Asia. In fact, whenever their family has fish, my nieces fight over who gets the honor of eating the eyeballs!

I don't know whether my acquaintance cheerfully accepted the bet or whether he explained that their bet was a poor choice to offer a Chinese person.

What food-related sucker bets exist in your culture? (I'm not talking about foods like chicken feet or tongue, which are clearly prepared and served to be eaten. I'm talking about things that an uninitiated person might consider to be a garnish or an inedible by-product, like shrimp heads.)

Update: I remind you that the question is not asking for foods which are served as dishes on their own.

• #### Why was there a font just for drawing symbols on buttons?

Using a font was a convenient way to have scalable graphics.

It's not like Windows could've used VML or SVG since they hadn't been invented yet. EMFs would have been overkill as well. Fonts were very convenient because the technology to render scalable fonts already existed and was well-established. It's always good to build on something that has been proven, and TrueType scalable font technology proved itself very nicely in Windows 3.1. TrueType has the added benefit of supporting hinting, allowing tweaks to the glyph outlines to be made for particular pixel sizes. (A feature not available in most vector drawing languages, but also a feature very important when rendering at small font sizes.)

• #### Keys duplicated from photo: Delayed reaction

There was a report some time ago that researchers have developed a way to duplicate keys given only a photograph. When I read this story, I was reminded of an incident that occurred to a colleague of mine.

He accidentally locked his keys in his car and called a locksmith. Frustratingly, the keys were sitting right there on the driver's seat. The locksmith arrived and assessed the situation. "Well, since you already paid for me to come all the way out here, how would you like a spare key?"

"Huh? What do you mean?"

The locksmith looked at the key on the driver's seat, studied it intently for a few seconds, then returned to his truck. A short while later, he returned with a freshly-cut key, which he inserted into the door lock.

The key worked.

• #### How do I print non-error messages during compilation?

Commenter Worf remarked, "My one wish is that `#warning` would be supported."

I always find it interesting when people say "I wish that Microsoft would stop following standards," since the `#warning` directive is nonstandard.

The Microsoft C/C++ compiler implements the feature in a method compatible with the standard, namely via a `#pragma` directive.

```
#pragma message("You really shouldn't be doing that.")
```

If you want to warn people away from deprecated functionality, you can use the `#pragma deprecated()` directive or the even more convenient (but more standards-troublesome) `__declspec(deprecated)` declaration specifier. The declaration specifier is much more convenient than the preprocessor directive because you can use it in a macro, and you can attach it to specific overloads of a function. (It's also more standards-troublesome because, while it is still permitted by the standard because it begins with a double-underscore, it is also not required to be ignored by compilers which do not understand it.)
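To illustrate the "you can use it in a macro" point, here is a rough sketch. The macro name `DEPRECATED_API` and the two `area` functions are hypothetical, and the `__attribute__((deprecated))` branch for GCC and Clang is my own addition so that the sketch compiles outside the Microsoft compiler; it is not something the directive itself gives you.

```c
/* Hypothetical wrapper macro: expands to the vendor-specific spelling,
   or to nothing on compilers that have neither. */
#if defined(_MSC_VER)
#  define DEPRECATED_API(msg) __declspec(deprecated(msg))
#elif defined(__GNUC__) || defined(__clang__)
#  define DEPRECATED_API(msg) __attribute__((deprecated(msg)))
#else
#  define DEPRECATED_API(msg)
#endif

/* The replacement: rejects negative dimensions by returning -1. */
int area_checked(int width, int height);

/* The old entry point: callers get a compile-time warning. */
DEPRECATED_API("use area_checked instead")
int area(int width, int height);

int area_checked(int width, int height)
{
    if (width < 0 || height < 0)
        return -1;
    return width * height;
}

int area(int width, int height)
{
    return width * height;
}
```

Code that calls `area` still compiles but draws a warning at each call site, while callers of `area_checked` see nothing.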

In my experience, however, printing messages during compilation is of little consequence. Print as many messages during compilation as you want; nobody will read them. The only thing that gets attention is an actual warning or error. (And in many cases, only the error will get any attention at all.)

• #### Puzzling out the upsell-o-meter

As I noted before, many grocery stores in the United States have a printer next to the cash register which prints out coupons customized to your purchases. Here's a purchase and the accompanying coupon. What is the story behind this pairing?

Purchased: Diapers for newborn baby.
Coupon: Save 75 cents on ice cream.

Bonus chatter: While waiting in line, I read the warning label on the diapers. It went on for quite a bit, but one part triggered my "I wonder what lawsuit led to this warning" sensor: "Like most articles of clothing, XYZ brand diapers will burn if exposed to flame." Did somebody say, "Oh no, there's a fire, what will I do? I know, I'll smother the fire with my baby's diapered bottom!"

• #### Why does CreateEvent fail with ERROR_PATH_NOT_FOUND if I give it a name with a backslash?

A customer reported that the `CreateEvent` function was failing with the unusual error code `ERROR_PATH_NOT_FOUND`:

```
HANDLE h = CreateEvent(0, FALSE, TRUE, "willy\\wonka");
if (h == NULL) {
    DWORD dwError = GetLastError(); // returns ERROR_PATH_NOT_FOUND
    ...
}
```

The customer continued, "The documentation for `CreateEvent` says that the `lpName` parameter must not contain the backslash character. Clearly we are in error for having passed an illegal character, but why are we getting the strange error code? There is no file path involved. Right now, we've added `ERROR_PATH_NOT_FOUND` to our list of possible error codes, but we'd like an explanation of what the error means."

Okay, first of all, building a table of all known error codes is another compatibility problem waiting to happen. Suppose in the next version of Windows, a new error code is added, say, `ERROR_REJECTED_BY_SLASHDOT`. What will your program do when it gets this new error code?

Now back to the error code. There is no file path involved here, so why is there a path-not-found error?

Because it's not a file system path that failed. It's an object namespace path.

If a backslash appears in the name of a named object, it is treated as a namespace separator. (If there is no backslash, the name is interpreted as part of the Local namespace.) And the call fails with a path-not-found error since there is no namespace called `willy`, so the path traversal inside the object namespace fails.
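If you would rather catch this before the call than decode the error afterward, a pre-flight check along the following lines might do. This is only a sketch: `is_safe_object_name` is a hypothetical helper, not a Windows API. It assumes that the only namespace prefixes you intend to allow are the documented `Global\` and `Local\` ones, and it treats a null or empty name as unusable, which is my choice rather than a rule of the system.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: accepts a name with no backslash at all, or a
   name whose only backslash belongs to a leading "Global\" or "Local\"
   prefix. Anything else would be treated as a namespace path. */
bool is_safe_object_name(const char *name)
{
    static const char *const prefixes[] = { "Global\\", "Local\\" };

    if (name == NULL || name[0] == '\0')
        return false;

    /* Skip at most one recognized prefix (the comparison is case-sensitive). */
    for (size_t i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++) {
        size_t length = strlen(prefixes[i]);
        if (strncmp(name, prefixes[i], length) == 0) {
            name += length;
            break;
        }
    }

    /* What remains must be non-empty and free of namespace separators. */
    return name[0] != '\0' && strchr(name, '\\') == NULL;
}
```

With this check, `"willy\\wonka"` is rejected before it ever reaches the object manager, while `"willy_wonka"` and `"Local\\willy_wonka"` pass. Private namespaces are deliberately out of scope here.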

The treatment of the backslash as a namespace separator is sort of alluded to in the very next sentence of the documentation: "For more information, see Kernel Object Namespaces." The following paragraph also expands upon this idea: "The object can be created in a private namespace. For more information, see Object Namespaces." The documentation sort of assumes you'll follow the links and learn more about those namespacey things, at which point you'll learn what that backslash in the object name really means (and why there is the rule about not allowing backslashes).

But here it is if you don't want to try to figure it out:

"If you put a backslash in the name, it is treated as a namespace separator, and if you don't know what a namespace is, then that's probably not what you want. So don't use backslashes unless you know what you're doing."

• #### What a steal: A house for only ten dollars!

When I was signing the papers for a house purchase many years ago, I noticed that the deed papers read

The Grantor «names of people selling the house»
for and in consideration of TEN DOLLARS AND OTHER GOOD AND VALUABLE CONSIDERATION
in hand paid, conveys and warrants to «me»
the following described real estate...

I noticed that I technically was buying the house for ten dollars.

The closing agent explained, "Well, ten dollars and other consideration. This is just a convention, so that the actual amount paid for the house doesn't go into the record." It also saves them from having to revise the document each time the price changes. (I don't buy the "doesn't go into the record" argument, because you can still look up the actual sale price from public records. But this was back in the days before Zillow, when this sort of information was not available online, and if you wanted to look it up, you had to (gasp) physically visit the county records office.)

When I told this story to one of my friends who was also buying a house, he said, "Cool. I'm going to ask them to put down \$20 when they draw up the papers for my house."

Therefore, according to the title transfer records, he paid twice as much for his house as I did for mine.

• #### How can I detect the language a run of text is written in?

A customer asked, "I have a Unicode string. I want to know what language that string is in. Is there a function that can give me this information? I am most interested in knowing whether it is written in an East Asian language."

The problem of determining the language in which a run of text is written is rather difficult. Many languages share the same script, or at least very similar scripts, so you can't just go based on which Unicode code point ranges appear in the string of text. (And what if the text contains words from multiple languages?) With heuristics and statistical analysis and a large enough sample, the confidence level increases, but reaching 100% confidence is difficult. I vaguely recall that there is a string of text which is a perfectly valid sentence in both Spanish and Portuguese, but with radically different meanings in the two languages!

The customer was unconvinced of the difficulty of this problem. "Language detection of a single Unicode character should work with 100% accuracy. After all, the operating system already has a function to do this. When I pass the run of text to GDI, it knows to use a Chinese font to render the Chinese characters and a Korean font to render the Korean characters."

The customer has fallen into the trap of confusing scripts with languages. The customer in this case is an East Asian company, so they have entered the linguistic world with a mindset that each language has its own unique script, since that is true for the languages in their part of the world.

It's actually kind of interesting seeing a different set of linguistic assumptions. Whereas companies in the United States assume that every language is like English, it appears that companies in East Asia assume that every language is like English, Japanese, Chinese, Korean, or Thai. In this company's world, the letter "A" is clearly English, since it never occurred to them that it might be German, Swedish, or French.

When GDI is asked to render a run of text, it looks for a font that can render each specific character, and once it finds such a font, it tries to keep using that font until it runs into a character which that font doesn't support, and then it begins a new search. You can see this effect when a non-Western character is inserted into a string when rendered on a system whose default code page is Western. GDI will switch to a font that supports the non-Western character, and it will keep using that font for the remainder of the string, even though the rest of the string uses just the letters A through Z. For example, the string might render like this: Dvořak. GDI switched to a different font to render the "ř" and remained in that font instead of returning to the original font for the "ak".

Anyway, the answer to the customer's question of language detection is to use the language detection capability of the Extended Linguistic Services.

If you are operating in the more constrained world of "I just want to know if it's Chinese/Japanese/Korean/Thai or isn't," then you could fall back to checking Unicode character ranges. If you see characters in the ranges dedicated to characters from those East Asian scripts, then you have found text which is (at least partially) in one of those languages. Note, however, that this algorithm requires continual tweaking because the Unicode standard is a moving target. For example, the range of characters which can be used by East Asian languages expanded with the introduction of the Supplementary Ideographic Plane. You're probably best off just letting somebody else worry about this, say, by asking `GetStringTypeEx` for `CT_CTYPE3` information, or using `GetStringScripts` (or its redistributable doppelgänger `DownlevelGetStringScripts`), or simply by asking ELS to do everything.
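For the curious, the range-checking fallback might look something like the sketch below. The function names and the range table are my own illustration. The table is a snapshot of a handful of well-known blocks and is certainly not exhaustive, which is exactly the maintenance problem described above. The sketch also assumes that the text has already been decoded into UTF-32 code points; a Windows UTF-16 string would need its surrogate pairs combined first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t first;
    uint32_t last;
} CodePointRange;

/* Illustrative, non-exhaustive snapshot of blocks used by Chinese,
   Japanese, Korean, and Thai text. */
static const CodePointRange east_asian_ranges[] = {
    { 0x0E00,  0x0E7F  },  /* Thai */
    { 0x1100,  0x11FF  },  /* Hangul Jamo */
    { 0x3040,  0x309F  },  /* Hiragana */
    { 0x30A0,  0x30FF  },  /* Katakana */
    { 0x3400,  0x4DBF  },  /* CJK Unified Ideographs Extension A */
    { 0x4E00,  0x9FFF  },  /* CJK Unified Ideographs */
    { 0xAC00,  0xD7AF  },  /* Hangul Syllables */
    { 0xF900,  0xFAFF  },  /* CJK Compatibility Ideographs */
    { 0x20000, 0x2FFFF },  /* Supplementary Ideographic Plane */
};

/* True if the code point falls inside any range in the table. */
bool is_east_asian_code_point(uint32_t code_point)
{
    size_t count = sizeof(east_asian_ranges) / sizeof(east_asian_ranges[0]);
    for (size_t i = 0; i < count; i++) {
        if (code_point >= east_asian_ranges[i].first &&
            code_point <= east_asian_ranges[i].last)
            return true;
    }
    return false;
}

/* True if at least one code point in the run matches the table. */
bool contains_east_asian_text(const uint32_t *text, size_t length)
{
    assert(text != NULL || length == 0);
    for (size_t i = 0; i < length; i++) {
        if (is_east_asian_code_point(text[i]))
            return true;
    }
    return false;
}
```

Note that this answers only "does the run contain characters from these scripts?" It cannot tell Chinese from Japanese when both use the same ideographs, which is the script-versus-language confusion all over again.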
