Postings are provided as is with no warranties, and confer no rights. Opinions expressed here are my own delusions; my employers at best shake their heads and sigh, at worst repudiate the content with extreme prejudice, whenever it manages to appear on their radar.
This blog is unsuitable for overly sensitive persons with low self-esteem and/or no sense of humour. Proceed at your own risk. Use as directed. Do not spray directly into eyes. Caution: filling may be hot. Do not give to children under 60 years of age. Not labeled for individual sale. Do not read 'natas teews ym' backwards. Objects in mirror are closer than they appear. Chew before swallowing. Do not bend, fold, spindle or mutilate. Do not take orally unless directed by a physician. Remove baby before folding stroller. Not for use on unexplained calf pain.
A nice FLAIR (FLuid Attenuated Inversion Recovery) view from the not-too-distant past. Every abnormality you can see on this scan (and there is more than one!) is asymptomatic at present. Alongside is a picture of me walking the walls at Fremont Studios, a sign of a damaged brain.
I am pretty sure that Miss Manners did not cover this one in her column.
Let us say that I am scooting around in one of my scooters, and I have to enter a building. Let us further say that the building has one of those handy automatic door openers, so opening the door and getting in should be easy.
Everything is going well, right?
As it turns out, the answer may be no. Because some person may see me heading into the building and that person may decide to help.
Unfortunately, that person will often end up doing something that is the opposite of helping -- unintentionally blocking the way, not having the door open soon enough to let me get through the second door, trying to open it while standing out of the way in a position where they are clearly stretching uncomfortably.
What do I do then?
I definitely don't want to discourage people from helping; wanting to help out is a noble goal, worthy of praise.
But at the same time, I do not want to see myself or others injured, or to have my attempt to do something made harder by an attempt to help that (for lack of a better word) doesn't.
The generic "Thanks, I've got it" does not seem to do the trick; people answer with "it's no problem, really!" as they unknowingly proceed to make sure that a problem does, in fact, exist.
I have taken to holding back far enough that it is clear I am not heading in yet, and then (once the coast is clear) going in. But there was one occasion where I actually did end up needing help -- it is not a common situation, but there must be some way to handle this without making people feel bad for helping, right?
Lacking Miss Manners to draw on, does anyone have a thought on the best way to handle this kind of a situation?
Apparently, my existing home mortgage is very impressive.
Because starting in June of this year I have been getting a lot of e-mail claiming that it makes me eligible for a decreased rate (as of midnight 4/26!).
What makes my mortgage most impressive is that I never had to sign any papers to get it. And I never have to pay any money back. On the other hand, I never received any money, either.
I don't even have a home; I live in an apartment.
I won't get into how annoying spam is; I think we all know about that. But I did find one interesting fact about all these mails -- they all follow the same pattern. Here is how they all start:
And so on (there are many more, but you probably get the point).
Every one is different, and it makes you feel like there is some script that runs and creates these random mails that follow the same basic structure but plug in words, presumably to confound email spam detectors.
Of course, the link that each one offers also seems to be different, and I have never followed any of them (even if it were genuine, they could never beat my current mortgage!).
But what was weird was how I looked at it, from a linguistic standpoint.
The first one ever sent, I dismissed as spam since it referred to a mortgage that did not exist and the English seemed off to me.
But as they started piling up, I still knew it was spam, but I started to realize that knowledge of English would probably be required to be able to have more variations than anything short of a thesaurus could offer. Perhaps that comes from some script writer, but clearly someone has to have some idea of how the language works.
So it struck me as both brilliant and idiotic:
BRILLIANT because sooner or later they might get the syntax right, and really they are not trying to; they just want to avoid spam detection.
IDIOTIC because on the first day I received four of them. Only a fellow idiot would actually jump up to take the offer.
It makes me wonder why they bother -- do people fall for this nonsense?
Imagine if they used their power for good?
Back in February of this year, KJK::Hyperion asked:
Michael, do you happen to know what these console character attributes are _really_ for?
COMMON_LVB_LEADING_BYTE
COMMON_LVB_TRAILING_BYTE
COMMON_LVB_GRID_HORIZONTAL
COMMON_LVB_GRID_LVERTICAL
COMMON_LVB_GRID_RVERTICAL
COMMON_LVB_REVERSE_VIDEO
COMMON_LVB_UNDERSCORE
The only real information I've found is a KB article ( http://support.microsoft.com/default.aspx?scid=kb;en-us;145925 ) explaining that they won't be supported in Windows 95, but it doesn't say a lot about them, and no mention of "keisen ruled lines" anywhere else on the internet. Any idea?
And then just last month, Robert Hodge asked the same question:
Do you know of any way to get characters to take an underscore attribute when being used by a Win32 Console application? The MSDN documentation for SetConsoleTextAttribute and CHAR_INFO (etc.) talks about an attribute called COMMON_LVB_UNDERSCORE, which is used like a 'color' setting and is used to get characters underlined, but it supposedly is associated with DBCS. I have tried using it but without success. I am considering a fallback strategy of altering a font to include pre-underlined characters, but that would be a lot of work. Any ideas? Thanks.
Unfortunately, the documentation is correct -- this particular feature is only supported in CJK (Chinese, Japanese, Korean) contexts.
Rich support of text is something better suited to non-console applications. These attributes are not generally available. Sorry!
This post brought to you by "●" (U+25cf, a.k.a. BLACK CIRCLE)
People still may not all agree about what to call it, like I said back in Is it Hangul? or Hangeul? or Han'gŭl? or what?
But today is the day we can remember the actions of King Sejong the Great, the creator of Hangeul (maybe we should just say 한글 and not worry about imperfect transliteration systems).
(via Bill Poser on Language Log, in a post where some very good information about 한글 can be found)
Prior posts in this series:
Extending collation support in SQL Server and Jet, Part 0 (HISTORY)
Extending collation support in SQL Server and Jet, Part 1 (the broad strokes)
Extending collation support in SQL Server and Jet, Part 2 (generating sort keys)
Extending collation support in SQL Server and Jet, Part 2.1 (is this on?)
I thought I would finish that class first, and then be able to move on to the next steps....
Now, if you remember in string.Compare is for sissies (not for people who want SQLCLR consistency) there are all of the flag values that change between SQL Server/Windows and the CompareOptions enumeration. We will use that information for a private method in our class and a special constructor to take the SQL Server flags values.
Here goes:
using System.Globalization;

public sealed class CustomSqlCollation
{
    //
    // Private members
    //
    private CultureInfo m_cultureInfo;
    private CompareInfo m_compareInfo;
    private CompareOptions m_options;
    private int m_flags;
    private string m_name;
    private int m_lcid;

    // Win32 comparison flag values
    private const int NORM_IGNORECASE     = 0x00000001;
    private const int NORM_IGNOREKANATYPE = 0x00010000;
    private const int NORM_IGNORENONSPACE = 0x00000002;
    private const int NORM_IGNORESYMBOLS  = 0x00000004;
    private const int NORM_IGNOREWIDTH    = 0x00020000;
    private const int SORT_STRINGSORT     = 0x00001000;

    // Translate Win32/SQL Server flags into the equivalent CompareOptions
    private static CompareOptions CompareOptionsFromFlags(int flags)
    {
        CompareOptions options = CompareOptions.None;
        if ((flags & NORM_IGNORECASE) != 0) { options |= CompareOptions.IgnoreCase; }
        if ((flags & NORM_IGNOREKANATYPE) != 0) { options |= CompareOptions.IgnoreKanaType; }
        if ((flags & NORM_IGNORENONSPACE) != 0) { options |= CompareOptions.IgnoreNonSpace; }
        if ((flags & NORM_IGNORESYMBOLS) != 0) { options |= CompareOptions.IgnoreSymbols; }
        if ((flags & NORM_IGNOREWIDTH) != 0) { options |= CompareOptions.IgnoreWidth; }
        if ((flags & SORT_STRINGSORT) != 0) { options |= CompareOptions.StringSort; }
        return options;
    }

    // Translate CompareOptions into the equivalent Win32/SQL Server flags
    // (returns int; the original declaration said CompareOptions)
    private static int FlagsFromCompareOptions(CompareOptions options)
    {
        int flags = 0;
        if ((options & CompareOptions.IgnoreCase) != 0) { flags |= NORM_IGNORECASE; }
        if ((options & CompareOptions.IgnoreKanaType) != 0) { flags |= NORM_IGNOREKANATYPE; }
        if ((options & CompareOptions.IgnoreNonSpace) != 0) { flags |= NORM_IGNORENONSPACE; }
        if ((options & CompareOptions.IgnoreSymbols) != 0) { flags |= NORM_IGNORESYMBOLS; }
        if ((options & CompareOptions.IgnoreWidth) != 0) { flags |= NORM_IGNOREWIDTH; }
        if ((options & CompareOptions.StringSort) != 0) { flags |= SORT_STRINGSORT; }
        return flags;
    }

    //
    // Constructors
    // (the "false" argument means: do not apply user overrides)
    //
    public CustomSqlCollation(string name, CompareOptions options)
    {
        this.m_options = options;
        this.m_flags = FlagsFromCompareOptions(options);
        this.m_cultureInfo = new CultureInfo(name, false);
        this.m_compareInfo = this.m_cultureInfo.CompareInfo;
        this.m_name = name;
        this.m_lcid = this.m_compareInfo.LCID;
    }

    public CustomSqlCollation(int lcid, CompareOptions options)
    {
        this.m_options = options;
        this.m_flags = FlagsFromCompareOptions(options);
        this.m_cultureInfo = new CultureInfo(lcid, false);
        this.m_compareInfo = this.m_cultureInfo.CompareInfo;
        this.m_name = this.m_compareInfo.Name;
        this.m_lcid = lcid;
    }

    public CustomSqlCollation(string name, int flags)
    {
        this.m_flags = flags;
        this.m_options = CompareOptionsFromFlags(flags);
        this.m_cultureInfo = new CultureInfo(name, false);
        this.m_compareInfo = this.m_cultureInfo.CompareInfo;
        this.m_name = name;
        this.m_lcid = this.m_compareInfo.LCID;
    }

    public CustomSqlCollation(int lcid, int flags)
    {
        this.m_flags = flags;
        this.m_options = CompareOptionsFromFlags(flags);
        this.m_cultureInfo = new CultureInfo(lcid, false);
        this.m_compareInfo = this.m_cultureInfo.CompareInfo;
        this.m_name = this.m_compareInfo.Name;
        this.m_lcid = lcid;
    }

    public int LCID { get { return this.m_lcid; } }

    public string Name { get { return this.m_name; } }

    // Method to return an index value (the raw bytes of the sort key)
    public byte[] GetSortKey(string input)
    {
        return this.m_compareInfo.GetSortKey(input, this.m_options).KeyData;
    }

    //
    // Compare overloads
    //
    public int Compare(string string1, string string2)
    {
        return this.m_compareInfo.Compare(string1, string2, this.m_options);
    }

    public int Compare(string string1, int offset1, string string2, int offset2)
    {
        return this.m_compareInfo.Compare(string1, offset1, string2, offset2, this.m_options);
    }

    public int Compare(string string1, int offset1, int length1, string string2, int offset2, int length2)
    {
        return this.m_compareInfo.Compare(string1, offset1, length1, string2, offset2, length2, this.m_options);
    }

    //
    // IndexOf overloads
    //
    public int IndexOf(string source, string value)
    {
        return this.m_compareInfo.IndexOf(source, value, this.m_options);
    }

    public int IndexOf(string source, char value, int startIndex)
    {
        return this.m_compareInfo.IndexOf(source, value, startIndex, this.m_options);
    }

    public int IndexOf(string source, string value, int startIndex)
    {
        return this.m_compareInfo.IndexOf(source, value, startIndex, this.m_options);
    }

    public int IndexOf(string source, char value, int startIndex, int count)
    {
        return this.m_compareInfo.IndexOf(source, value, startIndex, count, this.m_options);
    }

    public int IndexOf(string source, string value, int startIndex, int count)
    {
        return this.m_compareInfo.IndexOf(source, value, startIndex, count, this.m_options);
    }

    //
    // LastIndexOf overloads
    //
    public int LastIndexOf(string source, char value)
    {
        return this.m_compareInfo.LastIndexOf(source, value, this.m_options);
    }

    public int LastIndexOf(string source, string value)
    {
        return this.m_compareInfo.LastIndexOf(source, value, this.m_options);
    }

    public int LastIndexOf(string source, char value, int startIndex)
    {
        return this.m_compareInfo.LastIndexOf(source, value, startIndex, this.m_options);
    }

    public int LastIndexOf(string source, string value, int startIndex)
    {
        return this.m_compareInfo.LastIndexOf(source, value, startIndex, this.m_options);
    }

    public int LastIndexOf(string source, char value, int startIndex, int count)
    {
        return this.m_compareInfo.LastIndexOf(source, value, startIndex, count, this.m_options);
    }

    public int LastIndexOf(string source, string value, int startIndex, int count)
    {
        return this.m_compareInfo.LastIndexOf(source, value, startIndex, count, this.m_options);
    }

    //
    // IsPrefix method
    //
    public bool IsPrefix(string source, string prefix)
    {
        return this.m_compareInfo.IsPrefix(source, prefix, this.m_options);
    }

    //
    // IsSuffix method
    //
    public bool IsSuffix(string source, string suffix)
    {
        return this.m_compareInfo.IsSuffix(source, suffix, this.m_options);
    }
}
Some random notes on the class:
For now, I have left off the overloads that take other CompareOptions choices within the various methods. This is something that can be added later without too much trouble if people need them.
I am also creating a class here that could mostly be an override of CompareInfo, but I am resisting doing that for the moment, since that adds other requirements that may or may not be worth taking on.
The constructors that take flags values will handle string sorts, which SQL Server does not support but Windows does. It is just a few extra lines of code so it is no big deal if it is included or not.
Finally, of course when writing the private CompareOptionsFromFlags and FlagsFromCompareOptions methods, I could have done something clever with the numbers that are different rather than just building up the new numbers fully, but this just seems more self-documenting for an operation that probably does not happen very often.
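For the curious, here is a rough sketch of what the "something clever" might look like. It relies on the fact that IgnoreCase, IgnoreNonSpace, and IgnoreSymbols have the same numeric values (1, 2, and 4) in both the Windows flags and the CompareOptions enumeration, so only the other three need to be remapped. The FlagMappingSketch class, the FromFlags method, and SharedMask are names made up for this illustration; they are not part of the class above.

```csharp
using System;
using System.Globalization;

public static class FlagMappingSketch
{
    // Windows flag values whose numbers differ from CompareOptions.
    private const int NORM_IGNOREKANATYPE = 0x00010000;
    private const int NORM_IGNOREWIDTH    = 0x00020000;
    private const int SORT_STRINGSORT     = 0x00001000;

    // Hypothetical helper constant: IgnoreCase (1), IgnoreNonSpace (2) and
    // IgnoreSymbols (4) share their values in both schemes.
    private const int SharedMask = 0x00000007;

    public static CompareOptions FromFlags(int flags)
    {
        // Copy the three shared bits in one step...
        CompareOptions options = (CompareOptions)(flags & SharedMask);

        // ...and remap only the bits that live somewhere else.
        if ((flags & NORM_IGNOREKANATYPE) != 0) { options |= CompareOptions.IgnoreKanaType; }
        if ((flags & NORM_IGNOREWIDTH) != 0) { options |= CompareOptions.IgnoreWidth; }
        if ((flags & SORT_STRINGSORT) != 0) { options |= CompareOptions.StringSort; }
        return options;
    }

    public static void Main()
    {
        int flags = 0x00000001 | NORM_IGNOREWIDTH; // NORM_IGNORECASE | NORM_IGNOREWIDTH
        CompareOptions expected = CompareOptions.IgnoreCase | CompareOptions.IgnoreWidth;
        Console.WriteLine(FromFlags(flags) == expected); // prints True
    }
}
```

It saves three if statements at the cost of a magic number, which is pretty much why the longhand version seems more self-documenting.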
I'll write more in the series tomorrow or the day after....
This post brought to you by "Ǫ" (U+01ea, a.k.a. LATIN CAPITAL LETTER O WITH OGONEK)
Law & Order, a television show that is now in its 16th season, is one that I have pretty much watched since it first came on the air. Many people have theorized as to why the show has lasted as long as it has, but one of the reasons given in the show's own marketing is that cases are inspired directly from the headlines. This particular reason is the one I am going to focus on here.
Over the years, every time the final credits ran, we would see a quick disclaimer come up:
This story is fictional. No actual person or event is depicted.
Sometimes, the beginning of the show would also include a similar disclaimer, prior to the first scene:
The following story is fictional and does not depict any actual person or event.
Less often, they would be a bit more explicit about what the disclaimer was for:
Although inspired by actual events, the following story is fictional and not intended to depict any actual person or event.
And then in the single most graphically obvious example of a disclaimer being added, the first season episode entitled Indifference had the following disclaimer at the end:
Although some aspects of this story may remind you of the Lisa Steinberg case recently adjudicated in New York City, this episode and its characters are fictional and the events and actions portrayed do not reflect the actions of any principals involved in that case. In the actual case, the male defendant was convicted of manslaughter, while all charges against his female companion were dismissed. There was no evidence of her involvement in physical abuse of any child, or that any child was sexually abused by either adult.
To my knowledge, this is the only time that the original inspiration of a case was explicitly called out.
If you look at the various disclaimers used and especially the extreme one, the pattern becomes obvious. It is not generally speaking the accurate depiction of facts that troubles the lawyers of the show. It is that distance between the facts proven in a New York City court and those depicted in a fictional dramatization that tends to make them worry. In other words, it is the very differences that are intentionally introduced that make them the most nervous that people will assume they are trying to make statements about the original case.
In Indifference, the writers took the Lisa Steinberg case and as usual changed many important details -- the lawyer in the NYC criminal courts (which would have made an interesting story when it came to trying a member of the District Attorney's office!) becomes a psychologist who is a Reichian therapist. What in real life was an illegal adoption is in this case a set of biological parents. And an interesting if somewhat frightening piece of the plot of the show has the woman being the one who is physically abusing the children, in a syndrome in which doing so makes her feel empowered after she is herself abused by her husband (in other words "he hits her, and then she hits the children"). And then overlaid on top of all of that, the male actually does sexually abuse the daughter.
No wonder they wanted the extra disclaimer! Those extra differences that do distinguish the disturbing fiction from the deplorable fact are a good reason for the lawyers to be nervous about suggesting too much, and perhaps opening themselves up for a lawsuit based on the directions they took.
Even to this day, I wonder if the lawyers went far enough on this episode's disclaimer. When I first saw it back in 1990 I remember wondering whether the reason the charges were dropped against her was just for her testimony and whether the show might have some of the facts straight -- it seemed like too much "legalese" to think that there might not be something sinister, in my younger mind.
This gets us closer to the reason for having a disclaimer -- it is for when people might misunderstand, and then do the wrong thing with what is there.
Now it also can protect in the case where there is no misunderstanding. But that is usually not a cause for a lawsuit, if you know what I mean.
I do find it fascinating that people can market a show's plots as being "ripped from the headlines" and then think that a little text in the credits that many people will ignore can protect them. It probably was a big concern in the first season (thus they had that longer disclaimer narrated to be sure people would listen!).
Though by now the person suing would have to prove that the 16 years of the show is not relevant as proof that it is "just a show" so perhaps they are not as worried anymore....
Perhaps I should hold a contest to see who can come up with the most source inspirations for my disclaimer, with perhaps a special bonus for anyone who can think of the proper disclaimer text to keep people from misunderstanding the disclaimer? :-)
Yesterday, John Yunker posted about how there are More Reasons to Localize for the US Hispanic Market:
According to a recent study by Feedback Research of online consumers who lived in the US, spoke Spanish at home and/or used Spanish media regularly:
69% of Spanish-speakers who shop online preferred Spanish language sites when shopping or gathering information about products/services online.
49% of Spanish-speaking respondents who shop online stated that they were more likely to buy from a Spanish language site when shopping online.
I was thinking about those two numbers for a while after I saw the blog entry.
And I wondered whether those people who preferred Spanish had a preference about the "type" of Spanish. I do know that people from Spain often look at localized products as having too much Mexican influence, and I have even had occasion to hear the converse. And I know that French localizations are often seen as having either too much influence from France or from Quebec. The differences in spelling and word choice in English between the USA and a lot of the rest of the English speaking world are well-known. Or consider the fact that Arabic speakers in Morocco may not even find the Arabic spoken in Saudi Arabia to be intelligible at times (and vice versa). You could probably apply this type of argument to almost any language spoken in more than one location.
Or perhaps even within a single location!
After all, under ideal circumstances, it is easy for a product to appear "local" when properly localized for a particular target market. But that implies that all the members of the target market speak the exact same language, something that we know to be false.
Which is not to say that software companies (and I am including Microsoft here) seem to be targeting multiple flavors or dialects of a language in a single location (or even multiple locations). The goal is always to lower the cost of localization, and when that is the goal then with a limited number of exceptions it is hard to justify localizing into the same language more than once.
It is not true in all cases, but isn't there a class aspect related to dialect? Aren't we creating a class of people who can use computers?
And are we doing people a service if we work that hard to encourage them to use their language in a way that is not comfortable for them?
Solutions to such a problem are indeed elusive. It is easy to imagine that if you really did target the US Hispanic market in a Spanish localization that you would produce a product that would be more readily accepted than one that felt like it was translated for someone else. But when people are willing to settle for whatever is available, I doubt we could do much more than imagine, in any case.
And a model based on communities of people who can contribute their preferences for usage seems farfetched, since most people are not scientifically extracting their language usage, as they are too busy using it. There really has to be some force that gets something out of it, and believes that there is a good reason to do the research and spend the money to make it happen. Since we are all willing to deal with the one language we are given, how can we expect anything more?
This post brought to you by "ñ" (U+00f1, LATIN SMALL LETTER N WITH TILDE)
Yesterday, Heath Stewart posted about MSI Databases and Code Pages and it took me right back to when I was working on MSKLC.
You see, we wanted a nice easy setup that would install the keyboards people would create without making them buy a product.
I figured after all of the experience writing complex ACME setup scripts for the Office 97 Developer Edition Setup Wizard that I would be able to write a simple setup that has an easy job to do. And indeed it was -- I used the MSI API functions and managed to write something that would do the setups. Of course people who have Orca installed would regularly complain about the fact that it fails validation, but that is mainly because it is missing a validation table (of the 20 people who have mentioned the validation problem to me, only one person indicated that they noticed this).
Ok, sorry I forgot I meant to be talking about something else here!
Where was I? Oh yes, MSI databases and code pages.
Now MSKLC lets you create a keyboard description that can have in it any character in Unicode (minus a few things like the single quote character), so I ran headlong into the fact that the Windows Installer does not support Unicode.
Heath, that reminds me -- when is the Windows Installer going to support Unicode natively? And I don't mean UTF-7/UTF-8!
Until then, Heath's post about using code pages is a great reference for how best to handle the situation. Though I wish that best practices did not involve keeping everything in ASCII.
Now we just have to get them to support Unicode....
This post brought to you by "¬" (U+00ac, a.k.a. NOT SIGN)
With Visual Studio 2005 almost completely done, I figured it was time to send the updated instructions. You can find them right here.
For prior versions, see this article (note that I added the updated link).
Be sure to also read Rebuilding MFC/CRT and redistributing? for some of the legal issues.
Enjoy!
A few months ago, someone emailed me about some trouble they were having with their Arabic keyboard that they had created with MSKLC. They were confused about the fact that none of the letters seemed to ever shape. I asked them to send me the .KLC file so I could take a look. I went ahead and loaded it up in MSKLC on my machine:
It looked okay, although frankly it looked pretty similar to the Arabic 101 keyboard that has been in Windows.
But then I ran a tool that showed the code points instead of the characters:
Yikes! They were using characters in the Arabic Presentation Forms (A and B). Suddenly it became clear.
You see, Arabic is a language that shapes. So let us take U+0628 (ARABIC LETTER BEH). By itself, it looks like this:
ب
But things change when you combine it. Let us say it is at the beginning of the word, followed by U+062a (ARABIC LETTER TEH):
بت
That character on the far right is the BEH -- see how it looks different now?
Ok, let us say that it is surrounded by two different letters, say preceded by U+062e (ARABIC LETTER KHAH) and followed by U+062a (ARABIC LETTER TEH):
خبت
See that BEH in the middle there?
And now to round things out, let us say that the BEH is at the end, say after U+062e (ARABIC LETTER KHAH):
خب
Well, these four forms are known as the ISOLATED, INITIAL, MEDIAL, and FINAL forms.
Now back in the days before fonts were smart enough to do this sort of shaping, many legacy standards were built by actually encoding every possible form for each letter, thus:
ﺏ U+fe8f (ARABIC LETTER BEH ISOLATED FORM)
ﺑ U+fe91 (ARABIC LETTER BEH INITIAL FORM)
ﺒ U+fe92 (ARABIC LETTER BEH MEDIAL FORM)
ﺐ U+fe90 (ARABIC LETTER BEH FINAL FORM)
By combining the correct form with the correct form of KHAH or TEH you can make something look right, sometimes (other times the way they shape will cause these presentation forms to look not quite right). Combine that problem with the need for not quite four times the number of letters, and the need to train people to type the correct form depending on where it is in the word, and it is just a nightmare.
If you do not know the language this can be hard to conceptualize. So let us try doing it with the Latin script, using cursive writing. It is even more complicated, due to the wide variety of attachment points.
b
ob
oba
ba
and so on -- to support English alone in such cases would easily require 1000 or more glyphs, and you would have a really hard time writing without constantly picking the right letters.
So, the rule with presentation forms is that they are "pre-shaped" and thus do not need to be shaped again. But, like Latin, the exact attachment points may not always be the same, so it is best to use the real Arabic letter rather than the presentation forms. It will save you from needing to remember every form before you type a letter.... because as the title says, it cannot always pay to be compatible -- especially when one is being compatible with a hard to use legacy standard....
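If you want to check your own text (or the characters in a layout) for this problem, a minimal sketch along these lines might help. It flags code points in the two Arabic Presentation Forms blocks and uses Unicode compatibility normalization (Form KC) to map them back to the real letters. The PresentationFormCheck class and its method names are made up for this illustration, and note that Form KC also folds other compatibility characters, so treat it as a diagnostic aid rather than a general-purpose fix.

```csharp
using System;
using System.Text;

public static class PresentationFormCheck
{
    // Rough block-range check: Arabic Presentation Forms-A (U+FB50..U+FDFF)
    // and Arabic Presentation Forms-B (U+FE70..U+FEFC). U+FEFF is left out
    // on purpose, since it is the byte order mark and not a letter form.
    public static bool IsArabicPresentationForm(char c)
    {
        return (c >= '\uFB50' && c <= '\uFDFF') || (c >= '\uFE70' && c <= '\uFEFC');
    }

    // Compatibility normalization maps each pre-shaped form back to the
    // letter it is a form of (for example U+FE91 becomes U+0628).
    public static string ToRealLetters(string text)
    {
        return text.Normalize(NormalizationForm.FormKC);
    }

    public static void Main()
    {
        // ARABIC LETTER BEH INITIAL FORM + ARABIC LETTER TEH FINAL FORM
        string preShaped = "\uFE91\uFE96";

        foreach (char c in preShaped)
        {
            Console.WriteLine("U+{0:X4} presentation form: {1}", (int)c, IsArabicPresentationForm(c));
        }

        foreach (char c in ToRealLetters(preShaped))
        {
            Console.WriteLine("U+{0:X4}", (int)c); // U+0628, then U+062A
        }
    }
}
```

Once the text holds the real letters, the shaping engine and the font can pick the right form for each position, which is the whole point.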
This post brought to you by "ب" (U+0628, a.k.a. ARABIC LETTER BEH)
In the past week, I have had three different people ask me about MSKLC and x64 support, the most recent from a colleague just down the hall from me. I figured it might be time to explain a bit further than I did in this post. So here goes....
Let us stipulate for a moment that 64-bit computing is indeed where we are all heading, generally.
(I am not disagreeing with the idea, I am just trying to house my argument in correct legal language!)
So what is up with MSKLC?
Well, currently it does not support either IA64 or x64 processors. That fact has more to do with some unrelated factors about the core purpose of MSKLC when it was conceived, which was to "open it all up" and "get out of the way" when it came to language enablement on Windows.
Although being able to create new keyboards for accessibility reasons (for example) was made possible, that is merely a side effect of the two disciplines having overlapping requirements. The similarities were not enough for the accessibility team to create an On-Screen Keyboard that would fit all of the languages we have built in, or enough for the accessibility problems in MSKLC that Sara Ford pointed out to me to have been fixed before we shipped (don't worry, I am on them for next version!).
In summary, you end up with differences.
At the time that MSKLC came out, AMD64 compilers were just getting usable, nobody even had the term x64, and even though there was an IA64 client space, it was already squarely aimed at the server space as its main target. Were there people with vision? Of course! But that vision was not as broadly spread to everyone just yet. So when you look at our core MSKLC purpose, it just did not mesh at that time.
Of course, times change, and the landscape for x64 is very different now than it was a few years ago. Which makes the wider scenario more potentially interesting.
Now what goes into the change? Pretty much two things:
"What is so special about #2?" some may ask, especially when that directory has over 1,000 files in it on an x64 machine. But the truth is that there are only about 174 files (in the x64 Windows that just shipped) that are built differently than the regular x86 files, and of those 174 files, 123 of them are keyboard layout DLLs. The problem has to do with the info I started discussing in this post and never got back to, the fact that most keyboard layouts export one function and that function returns the raw data about the layout. On a 64-bit platform, the pointers must be wider than regular 32-bit pointers, so just like core files (kernel32.dll, user32.dll, etc.) special versions of the keyboard layout DLLs are needed. Therefore, for IA64 and x64 to be supported by MSKLC, it must build both files and put them into the build.
I had to put down my cat Chelsea last night.
Those of you who read this blog may have been following her story. But, in the true spirit of Seattle weather predictions, her passing that was judged to be little more than 30 days from the diagnosis actually was delayed for nearly seven months.
Which is okay, since she never took direction very well from anyone.
At first over those months, she was actually gaining weight (all of that Fancy Feast, no doubt!), at one point making it up to what was probably a lifetime high of 10 pounds. But then even as she ate a lot, she started losing weight. And by the end she would still walk up to get the food but would not even be able to finish those small cans. Her weight eventually made it as far down as 4 pounds.
When I got home, she had one pupil dilated and one fully contracted, but neither reacted to light. Her temperature was down and although you could see she was breathing she had some reflex actions that indicated she was struggling for oxygen -- because blood was probably just not circulating well, at all.
I took her to the vet right away (they were open until 9pm), but it really was too late. At some point on her last day she must have thrown several clots and had a stroke, with multiple areas affected. If it were not Chelsea, I probably would have been fascinated about the medical aspects of it. As it was, I had a bit too much on my mind. After all, it was time to say goodbye.
But as Dr. Cottrille (Dr. Emery had gone for the day) looked for a vein it was clear that her blood pressure was so low that an injection would probably be ineffectual. The only way to euthanize would be to inject it directly into the liver.
Bracing me for the worst, she warned me that it could take up to 20 minutes for it to act. But seconds later both pupils were fully dilated, and she was gone.
So my plea to the cat-goddess Bubastis? Well, it would be that my dearest Chelsea Antoinette can find happiness now. Bog knows that the delightful spirit of a cat who did no wrong deserves that much....
(for Chelsea)
Recently someone asked:
Can anyone point me to the document with list of allowed characters for AD username (W2K and W2K3)?
I am also looking for document which describes behavior that some characters are replaced during logon process.
Example:
If my username is ddomjanovic I am also able to login with username ddomjanović. So it looks like ć (codepage 1250, E6 = U+0107 : LATIN SMALL LETTER C WITH ACUTE) is replaced with c (63 = U+0063 : LATIN SMALL LETTER C) during logon process.
Can this behavior be disabled?
I sort of answered that question in this post, but in a roundabout way. The short answer is No, there is no way to disable that behavior. The reason is that Active Directory passes the following flags:
NORM_IGNORECASE | NORM_IGNORENONSPACE | NORM_IGNOREWIDTH | NORM_IGNOREKANA
which means that there are many distinctions like this that are folded together.
Now as that other post stated, local accounts do not work through AD, so they take a more literal stand on things. You know, that whole "UpCase and Binary" thing that not only considers c (U+0063) and ć (U+0107) to be different letters, but which also considers ć (U+0107) and ć (U+0063 U+0301) to be different ones, too. Since the latter pair looks alike, it is obviously a solution that to a lot of people will be worse than the original problem!
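To make the difference between the two behaviors concrete, here is a minimal Python sketch. It is only an approximation built on the standard unicodedata module, not the actual Windows comparison code: the function names are made up for illustration, the "loose" version only imitates ignoring case, nonspacing marks and width (it does nothing about kana), and Python's upper() is a stand-in for the real upcase table.

```python
import unicodedata


def fold_loosely(name: str) -> str:
    """Hypothetical stand-in for a comparison that ignores case,
    nonspacing marks, and width (kana folding is NOT imitated)."""
    # NFKD splits precomposed letters (U+0107 -> U+0063 U+0301)
    # and maps fullwidth forms to their ordinary counterparts.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the nonspacing marks (general category Mn), such as U+0301.
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return stripped.casefold()


def loose_equal(a: str, b: str) -> bool:
    """Roughly how the domain logon treats the two names."""
    return fold_loosely(a) == fold_loosely(b)


def upcase_binary_equal(a: str, b: str) -> bool:
    """Roughly 'UpCase and Binary': uppercase, then compare code points."""
    return a.upper() == b.upper()


plain = "ddomjanovic"
precomposed = "ddomjanovi\u0107"   # ends in U+0107
decomposed = "ddomjanovic\u0301"   # ends in U+0063 U+0301

print(loose_equal(plain, precomposed))               # True
print(upcase_binary_equal(plain, precomposed))       # False
print(upcase_binary_equal(precomposed, decomposed))  # False
```

The last line is the uncomfortable one: two names that look identical on screen compare as different once everything is "UpCase and Binary".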
Now he did actually find the answer in the MS Knowledge Base (Windows logon behavior if your user name contains characters that have accents or other diacritical marks).
This article mentions an interesting factoid about the issue, however:
The USERNAME variable in Windows is set to use the exact user name that you type in the User name box in the Log on to Windows dialog box. If you log on and you do not type the diacritical marks that are contained in your user name, the USERNAME variable also does not contain the diacritical marks in your user name. Therefore, the value of the USERNAME variable is different from the user name that is stored in Active Directory. To work around this behavior, log on to Windows by typing your user name in user principal name (UPN) format. To do this, type the following in the User name box, where UserName is your user name and DomainName is the name of the domain:
UserName@DomainName.com
It then references another article that talks about the issue further (USERNAME environment variable may differ from actual user name):
SYMPTOMS
When you log on to a Windows 2000-based domain, it is possible to use a logon name that is similar to the one that is stored in Active Directory. This may cause problems because the USERNAME environment variable is set to the user name that you typed in the logon dialog box, not to the user name that is stored in Active Directory. If any logon scripts relying on this variable, they may run up with unpredictable results.
RESOLUTION
A possible workaround to avoid this problem is to log on by using the user principal name (UPN) format. Instead of typing the user name, password, and domain on separate lines, type the UPN logon string in the User Name box. The UPN format is:
username@domain.com
Or, you can write a small program or batch file that resets the USERNAME environment variable to the value you need (you can get the actual user name with the Whoami utility) and add it in the Startup group.
STATUS
Microsoft has confirmed that this is a problem in the Microsoft products that are listed at the beginning of this article.
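The "small program" the article suggests might look something like the following Python sketch. This is an illustration under stated assumptions rather than anything the KB provides: the function names are hypothetical, it assumes a whoami command is on the path and prints either DOMAIN\user or just the user name, and changing the environment this way only affects the current process and whatever it launches afterward.

```python
import os
import subprocess


def account_from_whoami(output: str) -> str:
    """Pull the account name out of whoami-style output.

    Assumes the output is either 'DOMAIN\\user' or just 'user'.
    """
    text = output.strip()
    # Keep only the part after the last backslash, if there is one.
    return text.rsplit("\\", 1)[-1]


def reset_username(env=None) -> str:
    """Set USERNAME from whoami output (hypothetical helper).

    Only the given mapping is changed; by default that is this
    process's environment, which child processes inherit.
    """
    env = os.environ if env is None else env
    result = subprocess.run(
        ["whoami"], capture_output=True, text=True, check=True
    )
    env["USERNAME"] = account_from_whoami(result.stdout)
    return env["USERNAME"]
```

A logon script could call reset_username() first and then start whatever depends on USERNAME from the same process.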
Of course, the second article only claims to be a problem in Windows 2000 while the first claims to also be a problem in XP and Server 2003. But it is not clear from the article whether it is really referring to the two different issues and the USERNAME text in the first article was just a long digression. It would probably be better to leave in the reference and take out the extra text, to avoid the confusion about what is fixed and what is not....
Such issues are commonplace in the KB, an issue I'll talk about further another day.
This post brought to you by "Ć" (U+0106, LATIN CAPITAL LETTER C WITH ACUTE)
Every once in a while, it is nice to be able to blog about something I am working on while I am working on it. And over the past few days I have been clearing out a bunch of small bugs in collation edge cases, the earliest of which was reported back in 2001.
It occurred to me that there is a pattern I can use to narrow down where in the code the problem might be before I even look at the code. This is a good thing not because I am lazy but because there is a lot of code there....
So I thought it would be nice to lay out a lot of these issues in a post.
If this does not seem interesting to you, then you can probably move on at this point. :-)
Anyway, without further ado, the LIST -- worded as if one had a specific problem with sorting results and wanted to figure out the cause:
You can probably think of some specific issues that each of these problems might imply....
Anyway, back to bugs!
This post brought to you by "∂" (U+2202, a.k.a. PARTIAL DIFFERENTIAL)
There are those who believe that me telling folks how to not step in it does not make too much sense. But I would argue that my missteps are not as bad as the one I am mentioning here tonight.
As reported in The Register, there are people in Taiwan who are not especially happy with Google at the moment, with its labeling of Taiwan as a province of China in Google Earth. And as is typical of The Register, they are of course making fun of the issue as they report it.
Are they all 12 years old over at the Register? The one advantage to stories that start there is that even when they end up on /. there is nowhere to go but up. If you know what I mean.
Of course all of those former Microsofties at Google should have perhaps paid more attention to the issue with lines in the time zone map before they left; it might have saved them the pain of dealing with this issue, just as it did for the folks at MSN Virtual Earth.