In Microspeak, fit is a predicate noun which is never used on its own but always comes with a modifying adjective. For something to be a good fit is for something to be appropriate or suitable for a particular situation. The opposite of a good fit is not a bad fit, because that's pejorative. Rather, something that is not a good fit is referred to as a poor fit.
The purpose of a previewer plug-in is to allow users to view the media without opening it. An image editing tool would not be a good fit for the previewing feature. (Alternatively, "would be a poor fit for the previewing feature.")
To be a good fit with a particular group is to mesh well with that group's existing practices and conventions.
The Datacenter Edition of the product is a poor fit for most small businesses.
The group in question need not consist of people.
The results are obtained incrementally, which makes it a good fit for IQueryable<T> and LINQ.
Microsoft Human Resources loves to apply the concept of "fit" to people fitting into a job position.
It's often the case that when a question from a customer gets filtered through a customer liaison, some context gets lost. (I'm giving the customer the benefit of the doubt here and assuming that it's the customer liaison that removed the context rather than the customer who never provided it.) Consider the following request:
We would like to know more information about the method the shell uses to resolve shortcuts.
This is kind of a vague question. It's like asking "I'd like to know more about the anti-lock braking system in my car." There are any number of pieces of information that could be provided about the anti-lock braking system.
When we ask the customer, "Could you be more specific what type of information you are looking for?" the response is sometimes
We want to know everything.
This is not a helpful clarification. Do they want to start with Maxwell's Equations and build up from there?
As it happened, in the case of wanting more information about the method the shell uses to resolve shortcuts, they just wanted to know how to disable the search-based algorithm.
This sort of "ask for everything and figure it out later" phenomenon is quite common. I remember another customer who wanted to know "everything" about changing network passwords, and they wouldn't be any more specific than that, so we said, "Well, you can start with these documents, perhaps paying particular attention to this one, but if they tell us what they are going to be doing with the information, we can help steer them to the specific parts that will be most useful to them."
As it turned out, all the customer really wanted to know was "When users change their password, is the new password encrypted on the wire?"
Third example, and then I'll stop. Another customer wanted to know everything about how Explorer takes information from the file system and displays it in an Explorer window. After asking a series of questions, we eventually figured out that they in fact didn't want or need a walkthrough of the entire code path that puts results in the Explorer window. The customer simply wanted to know why two specific folders show up in their Explorer window with names that didn't match the file system name.
When you ask for more information, explain what you need the information for, or at least be more specific about what kind of "more information" you need. That way, you save everybody lots of time. The people answering your question don't waste their time gathering information you don't need (and gathering that information can be quite time-consuming), and you don't waste your time sifting through all the information you don't want.
You might say that these people are employing the for-if anti-pattern:
foreach (document d in GetAllPossibleDocumentation())
{
    if (d.Topic == "password encryption on the wire")
        return d;
}
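For contrast, here is a minimal C++ sketch of the two approaches. The `Document` type, the topic index, and both function names are invented for illustration and are not part of any real API. The first function mirrors the for-if loop; the second asks for the one topic directly.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical document record, invented for this sketch.
struct Document {
    std::string topic;
    std::string body;
};

// The for-if pattern: walk every document just to keep one of them.
const Document* FindByScanning(const std::vector<Document>& all,
                               const std::string& topic) {
    for (const Document& d : all) {
        if (d.topic == topic) return &d;
    }
    return nullptr;
}

// The direct question: look up the one topic you care about.
const Document* FindByTopic(const std::map<std::string, Document>& index,
                            const std::string& topic) {
    auto it = index.find(topic);
    return it == index.end() ? nullptr : &it->second;
}
```

Both return the same document. The difference is how much material somebody had to gather before the answer could be found.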
A customer liaison reported that their customer wants to be able to access their machine without needing a password. They just want to be able to net use * \\machine\share and be able to access the files right away. I guess because passwords are confusing, easy to forget, and just get in the way. Anyway, the customer discovered that they could do so on Windows XP by going to the folder they want to share, going to the Sharing tab, then clicking on the "If you understand the security risks but want to share files without running the wizard" link, and then on the Enable File Sharing dialog, clicking "Just enable file sharing."
What the customer wanted to know was if there was a way they could automate this process.
My response to the customer liaison went like this:
Your customer has chosen to ignore not one but two security warnings. Furthermore, since they are looking for an automated way of doing this, it sounds like they intend on deploying this "feature" to all the computers in their organization. Maybe they just enjoy being part of a botnet? Your customer is basically saying "I wish my computer to have no network security." They should at least restrict access to authenticated users. But if they insist on having their corporate network turned into a spam farm, they can enable the Guest account and say that it can "Access this computer from the network." Congratulations, your computers will soon be filled with malware and porn.
That last sentence made it into some people's quotes file.
While wasting time doing valuable background research on my computer, I received the following suggestion:
For enchanced video quality, click here.
It's good to know that the typo that I first encountered in 1993 is still alive and kicking.
(And even though it's not important to the story, people will demand some sort of follow-up, so here it is: I submitted feedback to the vendor, who said that it was a known issue fixed in the next update.)
Amit was curious why it takes longer for Task Manager to appear when you start it from the Ctrl+Alt+Del dialog compared to launching it from the taskbar.
Well, you can see the reason right there on the screen: You're launching it the long way around.
If you launch Task Manager from the taskbar, Explorer just launches taskmgr.exe via the usual CreateProcess mechanism, and Task Manager launches under the same credentials on the same desktop.
On the other hand, when you use the secure attention sequence, the winlogon program receives the notification, switches to the secure desktop, and displays the Ctrl+Alt+Del dialog. When you select Task Manager from that dialog, it then has to launch taskmgr.exe, but it can't use the normal CreateProcess because it's on the wrong desktop and it's running under the wrong security context. (Because winlogon runs as SYSTEM, as Task Manager will tell you.)
Clearly, in order to get Task Manager running on your desktop with your credentials, winlogon needs to change its security context, change desktops, and then launch taskmgr.exe. The desktop switch is probably the slowest part, since it involves the video driver, and video drivers are not known for their blazingly fast mode changes.
It's like asking why an international package takes longer to deliver than a domestic one. Because it's starting from further away, and it also has to go through customs.
As is common in many industries, Microsoft customer service records employ abbreviations for many commonly-used words. In the travel industry, for example, pax is used as an abbreviation for passenger. The term appears to have spread to the hotel industry, even though people who stay at a hotel aren't technically passengers. (Well, unless you think that with the outrageous prices charged by the hotels, the people are being taken for a ride.)
For a time, the standard abbreviation for customer in Microsoft's customer service records was cu. This changed, however, when it was pointed out to the people in charge of such things that cu is a swear word in Portuguese. The standard abbreviation was therefore changed to cx.
If you're reading through old customer records and you know Portuguese and you see the word cu, please understand that we are not calling the customer a rude name.
The person who introduced me to this abbreviation added, "I just spell out the word. It's not that much more work, and it's a lot easier to read."
Some years ago, I was asked to review a technical book, and one of the items of feedback I returned was that the comments in the code fragments were full of mysterious abbreviations. "Sgnl evt before lv cs." I suggested that the words be spelled out or, if you really want to use abbreviations, at least have somewhere in the text where the abbreviations are explained.
If I had wanted to demonstrate the social skills of a thermonuclear device, my feedback might have read "unls wrtg pzl bk, avd unxplnd n unnec abbvs."
A customer reported that the CreateEvent function was failing with the unusual error code ERROR_PATH_NOT_FOUND:
HANDLE h = CreateEvent(0, FALSE, TRUE, "willy\\wonka");
if (h == NULL) {
    DWORD dwError = GetLastError(); // returns ERROR_PATH_NOT_FOUND
    ...
}
The customer continued, "The documentation for CreateEvent says that the lpName parameter must not contain the backslash character. Clearly we are in error for having passed an illegal character, but why are we getting the strange error code? There is no file path involved. Right now, we've added ERROR_PATH_NOT_FOUND to our list of possible error codes, but we'd like an explanation of what the error means."
Okay, first of all, building a table of all known error codes is another compatibility problem waiting to happen. Suppose in the next version of Windows, a new error code is added, say, ERROR_REJECTED_BY_SLASHDOT. What will your program do when it gets this new error code?
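One way to avoid depending on a closed list is to handle the few codes you have a specific reaction to and send everything else down a generic path. The following is only a sketch: the `Action` enum and the `ActionForError` function are hypothetical, and the three numeric values are copied from winerror.h so that the fragment builds without the Windows headers.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from winerror.h so the sketch builds without Windows headers.
constexpr std::uint32_t kErrorFileNotFound = 2;  // ERROR_FILE_NOT_FOUND
constexpr std::uint32_t kErrorPathNotFound = 3;  // ERROR_PATH_NOT_FOUND
constexpr std::uint32_t kErrorAccessDenied = 5;  // ERROR_ACCESS_DENIED

// Hypothetical reactions a program might have to a failure.
enum class Action { FixTheName, AskForPermission, ReportGenericFailure };

Action ActionForError(std::uint32_t error) {
    switch (error) {
    case kErrorFileNotFound:
    case kErrorPathNotFound:
        return Action::FixTheName;
    case kErrorAccessDenied:
        return Action::AskForPermission;
    default:
        // Anything else, including codes that do not exist yet,
        // still gets a sensible generic response.
        return Action::ReportGenericFailure;
    }
}
```

The point of the `default` branch is that a code nobody has heard of today still lands somewhere reasonable tomorrow.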
Now back to the error code. There is no file path involved here, so why is there a path-not-found error?
Because it's not a file system path that failed. It's an object namespace path.
If a backslash appears in the name of a named object, it is treated as a namespace separator. (If there is no backslash, the name is interpreted as part of the Local namespace.) And the call fails with a path-not-found error since there is no namespace called willy, so the path traversal inside the object namespace fails.
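To make the rule concrete, here is a toy model of the lookup in portable C++. It is not how the kernel implements it: the function name and result type are invented, only the Local and Global prefixes are modeled (the real system also has session and private namespaces), and the exact string comparison is a simplification.

```cpp
#include <cassert>
#include <string>

enum class NameLookup { Ok, PathNotFound };

// Toy model only: splits the name at the first backslash and checks the
// prefix against the two namespaces this sketch knows about.
NameLookup ResolveObjectName(const std::string& name,
                             std::string* namespaceUsed) {
    const std::string::size_type slash = name.find('\\');
    if (slash == std::string::npos) {
        // No backslash: the name lives in the Local namespace.
        *namespaceUsed = "Local";
        return NameLookup::Ok;
    }
    const std::string prefix = name.substr(0, slash);
    if (prefix != "Local" && prefix != "Global") {
        // The "directory" part of the object path does not exist.
        return NameLookup::PathNotFound;
    }
    *namespaceUsed = prefix;
    return NameLookup::Ok;
}
```

In this model, "willy\\wonka" fails with the path-not-found result because there is no namespace named willy, while plain "wonka" resolves in Local.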
The treatment of the backslash as a namespace separator is sort of alluded to in the very next sentence of the documentation: "For more information, see Kernel Object Namespaces." The following paragraph also expands upon this idea: "The object can be created in a private namespace. For more information, see Object Namespaces." The documentation sort of assumes you'll follow the links and learn more about those namespacey things, at which point you'll learn what that backslash in the object name really means (and why there is the rule about not allowing backslashes).
But here it is if you don't want to try to figure it out:
"If you put a backslash in the name, it is treated as a namespace separator, and if you don't know what a namespace is, then that's probably not what you want. So don't use backslashes unless you know what you're doing."
A customer asked, "I have a Unicode string. I want to know what language that string is in. Is there a function that can give me this information? I am most interested in knowing whether it is written in an East Asian language."
The problem of determining the language in which a run of text is written is rather difficult. Many languages share the same script, or at least very similar scripts, so you can't just go based on which Unicode code point ranges appear in the string of text. (And what if the text contains words from multiple languages?) With heuristics and statistical analysis and a large enough sample, the confidence level increases, but reaching 100% confidence is difficult. I vaguely recall that there is a string of text which is a perfectly valid sentence in both Spanish and Portuguese, but with radically different meanings in the two languages!
The customer was unconvinced of the difficulty of this problem. "Language detection of a single Unicode character should work with 100% accuracy. After all, the operating system already has a function to do this. When I pass the run of text to GDI, it knows to use a Chinese font to render the Chinese characters and a Korean font to render the Korean characters."
The customer has fallen into the trap of confusing scripts with languages. The customer in this case is an East Asian company, so they have entered the linguistic world with a mindset that each language has its own unique script, since that is true for the languages in their part of the world.
It's actually kind of interesting seeing a different set of linguistic assumptions. Whereas companies in the United States assume that every language is like English, it appears that companies in East Asia assume that every language is like English, Japanese, Chinese, Korean, or Thai. In this company's world, the letter "A" is clearly English, since it never occurred to them that it might be German, Swedish, or French.
When GDI is asked to render a run of text, it looks for a font that can render each specific character, and once it finds such a font, it tries to keep using that font until it runs into a character which that font doesn't support, and then it begins a new search. You can see this effect when a non-Western character is inserted into a string when rendered on a system whose default code page is Western. GDI will switch to a font that supports the non-Western character, and it will keep using that font for the remainder of the string, even though the rest of the string uses just the letters A through Z. For example, the string might render like this: Dvořak. GDI switched to a different font to render the "ř" and remained in that font instead of returning to the original font for the "ak".
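The "sticky font" behavior is easy to simulate. The sketch below is a simulation of the rule as described here, not GDI's actual code: the `Font` and `FontRun` types, the coverage predicates, and the function name are all assumptions made for illustration.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical font description: a name plus a coverage test.
struct Font {
    std::string name;
    std::function<bool(char32_t)> supports;
};

// A stretch of text drawn with a single font.
struct FontRun {
    std::string fontName;
    std::u32string text;
};

// Splits text into runs using the sticky rule: keep the current font until
// it cannot draw a character, then search the font list again.
std::vector<FontRun> SplitIntoFontRuns(const std::u32string& text,
                                       const std::vector<Font>& fonts) {
    assert(!fonts.empty());
    std::vector<FontRun> runs;
    const Font* current = nullptr;
    for (char32_t ch : text) {
        if (current == nullptr || !current->supports(ch)) {
            // Search in priority order for the first font with the glyph.
            const Font* found = nullptr;
            for (const Font& f : fonts) {
                if (f.supports(ch)) { found = &f; break; }
            }
            // Nothing covers it: stay put, or start with the first font.
            if (found == nullptr) found = current ? current : &fonts.front();
            if (found != current) {
                runs.push_back(FontRun{found->name, std::u32string()});
                current = found;
            }
        }
        runs.back().text.push_back(ch);
    }
    return runs;
}
```

Given a "Western" font that covers only code points below U+0100 and a "Fallback" font that covers everything, the string Dvořak splits into "Dvo" in Western and "řak" in Fallback, which is the effect described above.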
Anyway, the answer to the customer's question of language detection is to use the language detection capability of the Extended Linguistic Services.
If you are operating in the more constrained world of "I just want to know if it's Chinese/Japanese/Korean/Thai or isn't," then you could fall back to checking Unicode character ranges. If you see characters in the ranges dedicated to characters from those East Asian scripts, then you found text which is (at least partially) in one of those languages. Note, however, that this algorithm requires continual tweaking because the Unicode standard is a moving target. For example, the range of characters which can be used by East Asian languages expanded with the introduction of the Supplementary Ideographic Plane. You're probably best just letting somebody else worry about this, say, by asking GetStringTypeEx for CT_CTYPE3 information, or using GetStringScripts (or its redistributable doppelgänger DownlevelGetStringScripts) or simply by asking ELS to do everything.
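If you do go the range-checking route, it might look something like this. This is a minimal sketch with invented names; the table lists only a handful of Unicode blocks and is deliberately incomplete, which is exactly the maintenance burden mentioned above. It detects scripts, not languages.

```cpp
#include <cassert>
#include <string>

struct CodePointRange {
    char32_t first;
    char32_t last;
};

// A few Unicode blocks used by East Asian scripts. Deliberately incomplete:
// new blocks keep getting added, so a table like this needs ongoing care.
const CodePointRange kEastAsianRanges[] = {
    {0x0E00, 0x0E7F},    // Thai
    {0x1100, 0x11FF},    // Hangul Jamo
    {0x3040, 0x309F},    // Hiragana
    {0x30A0, 0x30FF},    // Katakana
    {0x3400, 0x4DBF},    // CJK Unified Ideographs Extension A
    {0x4E00, 0x9FFF},    // CJK Unified Ideographs
    {0xAC00, 0xD7AF},    // Hangul Syllables
    {0x20000, 0x2FFFF},  // Supplementary Ideographic Plane
};

bool IsEastAsianCodePoint(char32_t ch) {
    for (const CodePointRange& r : kEastAsianRanges) {
        if (ch >= r.first && ch <= r.last) return true;
    }
    return false;
}

// True if at least one code point falls in one of the listed blocks.
bool ContainsEastAsianScript(const std::u32string& text) {
    for (char32_t ch : text) {
        if (IsEastAsianCodePoint(ch)) return true;
    }
    return false;
}
```

Note that a Han ideograph tells you the script, not whether the text is Chinese or Japanese, so even this narrow question gets only a partial answer.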
A customer wanted to know the internal file format of Visual SourceSafe databases. (That wasn't the actual question, but I've translated it into something equivalent but which requires less explanation.) They explained why they wanted this information:
We are doing some code engineering analysis on our project, so we need to extract data about every single commit to the project since its creation. Things like who did the commit, the number of lines of code changed, the time of day... We can then crank on all this data to determine things like "What time of day are most bugs introduced?" and possibly even try to identify bug farms. Since our project is quite large, we found that generating all these queries against the database creates high load on the server. To reduce the load on the server, we'd like to just access the database files directly, but in order to do that, we need to know the file format.
Oh great, directly accessing a program's internal databases while they're live. What could possibly go wrong?
I proposed an alternative:
Take a recent backup of your project and mount it on a temporary server as read-only. Run your data collection scripts against the temporary server. This will spike the load on the temporary server, but who cares? You're the only person using the temporary server; the main server is unaffected. After you collect all your data from the temporary server, you can then perform a much smaller number of queries against the live server to get data on the commits that took place since the last backup.
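The shape of that plan, as a rough sketch: everything here is hypothetical, including the `Commit` and `Server` types, the `CommitsAfter` query, and the use of "commits served" as a crude stand-in for server load. Timestamps are assumed to be non-negative.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical commit record.
struct Commit {
    int id;
    long long timestamp;
    std::string author;
};

// Hypothetical server; commitsServed is a crude stand-in for load.
struct Server {
    std::vector<Commit> commits;
    int commitsServed = 0;

    std::vector<Commit> CommitsAfter(long long timestamp) {
        std::vector<Commit> result;
        for (const Commit& c : commits) {
            if (c.timestamp > timestamp) result.push_back(c);
        }
        commitsServed += static_cast<int>(result.size());
        return result;
    }
};

std::vector<Commit> CollectHistory(Server& backupCopy, Server& live) {
    // The expensive part runs against the temporary server.
    std::vector<Commit> history = backupCopy.CommitsAfter(-1);

    // Find where the backup leaves off.
    long long newest = -1;
    for (const Commit& c : history) {
        if (c.timestamp > newest) newest = c.timestamp;
    }

    // Only the commits made since the backup come from the live server.
    std::vector<Commit> recent = live.CommitsAfter(newest);
    history.insert(history.end(), recent.begin(), recent.end());
    return history;
}
```

The live server ends up serving only the handful of commits made since the backup, no matter how long the project history is.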
Another round of the semi-annual link clearance.
And, as always, the obligatory plug for my column in TechNet Magazine: