Postings are provided as is with no warranties, and confer no rights. Opinions expressed here are my own delusions; my employers at best shake their heads and sigh, at worst repudiate the content with extreme prejudice, whenever it manages to appear on their radar.
This blog is unsuitable for overly sensitive persons with low self-esteem and/or no sense of humour. Proceed at your own risk. Use as directed. Do not spray directly into eyes. Caution: filling may be hot. Do not give to children under 60 years of age. Not labeled for individual sale. Do not read 'natas teews ym' backwards. Objects in mirror are closer than they appear. Chew before swallowing. Do not bend, fold, spindle or mutilate. Do not take orally unless directed by a physician. Remove baby before folding stroller. Not for use on unexplained calf pain.
A nice FLAIR (FLuid Attenuated Inversion Recovery) view from the not-too-distant past. Every abnormality you can see on this scan (and there is more than one!) is asymptomatic at present. Alongside is a picture of me walking the walls at Fremont Studios, a sign of a damaged brain.
You might remember when I was talking about how Thinking about MUI is making me bipolar.
Well, I was given the news that the problem was investigated and verified, and some initial work went into determining what the fix might entail.
However, there was not a compelling customer scenario based on actual requirements that I could name. Which means that under the current secret squirrel criteria it is a tough sell to try and get that fix pushed into the product ASAP.
And I can understand that -- me trying to put together a sample for this blog, even if it is at the request of customers and even though it found this problem, is not a business justification, not a customer scenario.
It is at best the result of a customer scenario.
And that is of course only true if we can assume the customers who were requesting a sample had actual business reasons behind wanting to get more information on MUI functions and if we can assume that they wanted info on functions like SetThreadPreferredUILanguages and if we can assume that their reasons involved wanting to use custom locales/custom cultures.
I can't disagree with the powers that be if they tell me that there are a lot of assumptions in that last paragraph. And that I would need a helluva lot more in the way of customer requests/requirements in hand to be able to attempt to justify getting the fix done sooner rather than later (for example in a service pack, as opposed to a future release).
So, for those who were looking for more information on MUI either after that post, after this one, or even before that, my request is to you.
If you are such a developer or architect and you were really looking into (or wanting to look into) using MUI and custom locales to try and provide a localized version outside the language list provided by Windows, then please let me know, either by commenting here or by sending me info via the contact link. Or if you are part of an internal group at Microsoft then just send me mail, you are eligible too!
I will go to the mat for this one if I can get enough ammunition from customers (inside and outside of Microsoft) to do so.
Help me, Obi Wans, you're my only hope....
This post brought to you by ហ (U+17a0, a.k.a. KHMER LETTER HA)
QTran asked via the Contact link:
Michael, Install the Vietnamese keyboard on XPSP2, and guess what it didn't work very well. The major stumbling block for me is when I have to enter digits. You see, I am a touch-typist and to type a number I have to press [AltGr]. Great, it kinda works for 1 to 5 but when trying 6-0, I can't press [AltGr] and the keys at the same time. I figure if I can use MSKLC to change the layout to what I get used to. Looks like it's not working either because MSKLC doesn't chain dead keys, then I came across your post about this in 2005/11/11. Definitely, it'll be great if it could. Looks like I am 1 year too late, but ... is there any chance that you'll revisit the issue ? Kind Regards
Well, revisiting is easy, but that won't mean it gets fixed or anything. :-)
The post in question is What to do with the Vietnamese keyboard on Windows?. Though for qtran's specific issue, using CTRL+ALT with the left hand might be a workaround to easily get at the 6/7/8/9/0 on the keyboard. Though I never mind when people want to build their own keyboards with MSKLC. :-)
Of course when I was talking about whether people had feedback about the Vietnamese keyboard, I was definitely thinking more about the language issues, specifically about the limitations surrounding the way characters with multiple diacritics are handled.
The fundamental problem is the limitation on chained dead keys in Windows.
The short version (for those who don't want to follow the link), is that there is only so much state that is held -- basically up to a single pending dead key. The only way, therefore, to support chained dead keys is to have a unique character that represents the two diacritics together. There is unfortunately no such character for most of the instances that would be needed for Vietnamese, which means there is no way to support the dead key model for Vietnamese.
So there are two workarounds:
1) Have a layout support the Unicode model (which is to type the base letter followed by the diacritics) -- this will create text in normalization form D.
2) Have a layout support a mixed model (which is to type a diacritic, followed by a base letter plus diacritic) -- this will create text that is in an interim state between Normalization Forms C and D.
Of the two, #1 seems like it would be more natural (and #2 seems like it would be non-intuitive for everyone). Though only someone who knows the language could say more about how such a keyboard would flow conceptually with the way someone would think about letters....
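To make the difference between the two workarounds concrete, here is a minimal sketch using Python's standard unicodedata module. The three strings are illustrative assumptions about what each input model would produce for ễ; in particular, which diacritic ends up on the dead key in the mixed model is my guess, not something defined by any shipping layout.

```python
import unicodedata

# Three ways the same letter could end up in a document.
precomposed = "\u1ec5"            # single code point (form C)
unicode_model = "e\u0302\u0303"   # workaround 1: base letter, circumflex, tilde
mixed_model = "\u00ea\u0303"      # workaround 2 (assumed): e-circumflex + combining tilde


def is_normalized(form, text):
    """True if the text is already in the given normalization form."""
    return unicodedata.normalize(form, text) == text


for label, text in [("precomposed", precomposed),
                    ("unicode model", unicode_model),
                    ("mixed model", mixed_model)]:
    print(label,
          "NFC" if is_normalized("NFC", text) else "-",
          "NFD" if is_normalized("NFD", text) else "-")

# All three are canonically equivalent, so normalizing makes them identical.
all_equal = len({unicodedata.normalize("NFC", s)
                 for s in (precomposed, unicode_model, mixed_model)}) == 1
```

The mixed-model string is in neither form, which is the "interim state" described in the second workaround. Any code that compares such text would want to normalize first.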
It might even be worth putting a .KLC file together that would support Vietnamese this way so people could try it out and see if it worked for them....
This post brought to you by ễ (U+1ec5, a.k.a. LATIN SMALL LETTER E WITH CIRCUMFLEX AND TILDE)
A common question I get relates to "how do I find a font for U+####[##]?" where the character code is whatever Unicode code point they need to show.
Like just the other day, the question I got:
Hi Michael, when browsing for how to write the reversed C I found your post on the matter. http://blogs.msdn.com/michkap/archive/2006/08/10/694191.aspx I need to reproduce a funerary monument in my thesis and though I found the Unicode values, 2183, 2184 I can't seem to find a font that actually prints them. Do you have an idea of what I could use? Word on PC or MAC, use both. Preferably in the public domain = free... :) Sorry for bothering you. BTW. The gravestone reads like this: MEMORIÆ JOANN. DAVID ACKERBLAD SVECI QUI OBIIT ROMAE A. D. VI. KAL. FEBR. CI <REVERSED C> I<REVERSED C> CCCXIX NE VIRI INTER EXTEROS VARIA DOCTRINA ILLUSTRIS SEPULCRUM SUORUM INCURIA TITULO CARERET POSITUM MDCCCXXIV
I have found in recent years that http://www.fileformat.info/ is a great place to easily look. If you go to the page for either U+2183 or U+2184, for example, you will see a link near the top that talks about the fonts that support the character. And by the looks of things there is at least Code2000 from James Kass that is a very inexpensive shareware font....
On that page, there is also a link to the Custom Font Report tool which will give you the list of characters supported by a font on your machine (you can also use a program like Character Map to look for anything on the BMP or the Word Insert Symbol dialog in Word 2003 or later to find anything on or off the BMP in a font).
There is also Alan Wood's Unicode Resources, which can be very helpful here in such cases....
This post brought to you by ↄ (U+2184, a.k.a. LATIN SMALL LETTER REVERSED C)
The thing released today that I am going to talk about is not the coolest product that will be widely available starting today, January 30, 2007.
It will not be used by as many people.
It will not change as many lives.
It is not localized into as many (or frankly any) other languages.
It will not have as big of an impact on the various companies providing hardware and software for computers.
But I still think it is cool as hell, cool enough that I do not resent it being buried under whatever else is going to be released on this fine day.
What product am I blathering about? You probably know already.
It is the one that fixes this problem and this problem and this problem, not to mention some other important bugs and problems that have been reported here and elsewhere.
Microsoft Keyboard Layout Creator 1.4 has now been released!
The download page can be found right here. The download is a bit larger than (a little over twice the size of) its younger cousin (MSKLC 1.3), though if you are looking for 64-bit support (AMD64/x64 or IA64), or if you are looking for custom locale support on Vista, you'll probably consider it to be worth the .NET Framework 2.0 requirement.
If not, then you'll still find MSKLC 1.3 is available, and it can run just fine with version 1.0 or 1.1 of the .NET Framework.
Enjoy!
This post is sponsored by every freaking character in Unicode 5.0
It is an old story.
You have a word.
A word like, for example, transparency.
And it has a connotation that is somewhere between neutral (when used objectively, e.g. describing a window surface) and negative (when used subjectively, e.g. a soon-to-be ex-girlfriend's assessment of a guy).
Then suddenly it becomes a big keyword, picked up to describe how great a company is if you know more about its plans.
So now it is great that e.g. Microsoft is so transparent about things in so many of its blogs, it is bad that e.g. Google isn't in its own blogs. Suddenly it is how one gets the proverbial girl instead of causes her to dump you.
(Not to say that this is why anyone ever dumped me. They always had good reasons, you can ask them. They just had different ones than this!)
But A. Skrobov (the one who wanted to know what ASMO stands for) asked over the weekend and again yesterday about:
the other question that I asked over the weekend, and the one which seems to have vanished tracelessly, was about the word "transparent" in relation to the Arabic font and codepage names. What does that word really mean, as in "Arabic Transparent" font, and in "transparent ASMO" codepage?
I asked a lot of people if they knew. Scooting around the building while waiting for builds to complete, I am sure people wondered where I was going to, moving from one person to the next.
Everyone knew it was for a blog post. I guess my own motives are pretty transparent these days, huh? :-)
Anyway, some people had brilliant guesses.
Some people had silly guesses.
Some people didn't have a freaking clue.
And at least a few people had guesses that were frankly kind of dumb (but there are only dumb guesses, there are no dumb people!).
But everyone knew someone else who would know the answer even though they did not know for sure. Thus Cathy Wissink was sure John McConnell would know this factoid. And Carolyn Parsons was sure Ali Basit would know. And so on.
Finally, Judy Safran-Aasen and Simon Daniels were pretty sure Paul Nelson would know.
He sort of did! According to Paul:
This is ancient history, but here it goes. Back in the days of Windows 3.1 (and the low resolution world) there was a problem when we made the Arabic fonts. They appeared too small. (problem sound familiar with complex scripts today?) Glyph Systems (actually Diane Collier did the work) was asked to scale up the size of the base letters for Arabic and Hebrew fonts so they would be more readable. Because the diacritic marks are not commonly used in daily text this was not too much of a problem as there is a low occurrence of clipping of the combining marks. The source of the outlines is the Monotype outlines for Simplified Arabic.
Though he did not know about the word transparent itself:
Transparent was the name given to the version of the font. Why? Probably some arbitrary choice based on some well thought out reason. There is nothing transparent about the font. It is just the name given.
Well, we have enough clues now, I think. Given that the font's intended purpose was to become more readable on those very low-resolution screens, transparent as a synonym of apparent, obvious, visible, understandable, and so on makes a lot of sense.
The code page coverage is not 100% analogous to the font's coverage, but perhaps it was back in the days of the original font.
Of course the Arabic Transparent font was dropped in Vista, and it is amusing to guess that this was due to the Typography team not wanting to be using such a "buzzword" in a font name (though the real reason would be more along the lines that twenty years later the needs of Arabic typography are really not the same and the font was not adding much and was actually missing a lot of the important shaping behavior seen in other fonts).
I think it is ironic and amusing that we "dumped" the font that was "transparent", given the usual professional love of the word (especially these days, where it has become a buzzword like innovation) and the sometimes negative personal connotation....
Maybe I just need to be thankful that no one is going to dump me for my transparent motives vis-a-vis the blog? :-)
This post brought to you by ش (U+0634, a.k.a. ARABIC LETTER SHEEN)
Windows has some time zone information stored in its innards.
But Windows only lives in the now; it pays no attention to what was and what will be, even when it does know.
In the words of George Carlin, I think I'll repeat that since it seems vaguely important.
Windows only lives in the now; it pays no attention to what was and what will be, even when it does know.
Isn't Windows just the little hedonist? :-)
Which is part of the answer to a question Benoit Houle posed via the Contact link (probably better for the Suggestion Box, but I won't quibble; no one reads my text about this anyway):
Hi, after installing patch KB928388, everything works fine for DST in the future but it does not seems to work in the past. It looks like the DTS suppose to start only in 2007 is applied to years prior to this date. So by using the DateTime.ToUniversalTime() function for 2006, we are off by one week in fall and 3 weeks in spring. you can try it with this small C# .NET code:
    Console.WriteLine("");
    Console.WriteLine("MARCH");
    DateTime dt = Convert.ToDateTime("2006-03-01");
    for (int i = 0; i < 60; i++)
    {
        dt = dt.AddDays(1);
        Console.WriteLine(dt.ToUniversalTime().ToString("yyyy-MM-dd HH:mm:ss"));
    }
    Console.WriteLine("");
    Console.WriteLine("NOVEMBER");
    dt = Convert.ToDateTime("2006-10-01");
    for (int i = 0; i < 60; i++)
    {
        dt = dt.AddDays(1);
        Console.WriteLine(dt.ToUniversalTime().ToString("yyyy-MM-dd HH:mm:ss"));
    }
Notice how the DST should be applied on october 29th in 2006 and is applied on november 5th
NOTE: I am in Canada Standard Eastern Time Zone.
I talk to MS support people and they did not have an answer for me.
Any solutions?
Best regards Benoit Houle
Everything is behaving as designed, until and unless some future version adds the "feature" of historical time zone knowledge.
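Benoit's numbers are exactly what falls out when a single "current" rule set is applied to every year. As a rough sketch (plain calendar arithmetic, not what Windows or the .NET Framework actually runs), the following compares the pre-2007 North American rule (first Sunday in April to last Sunday in October) with the 2007 rule (second Sunday in March to first Sunday in November), both evaluated for 2006. The helper names are made up for this illustration.

```python
from datetime import date, timedelta


def nth_sunday(year, month, n):
    """Return the n-th Sunday of the given month (n starts at 1)."""
    first = date(year, month, 1)
    # date.weekday(): Monday is 0, Sunday is 6.
    offset = (6 - first.weekday()) % 7
    return first + timedelta(days=offset + 7 * (n - 1))


def last_sunday(year, month):
    """Return the last Sunday of the given month."""
    next_month = date(year + (month == 12), month % 12 + 1, 1)
    last_day = next_month - timedelta(days=1)
    return last_day - timedelta(days=(last_day.weekday() + 1) % 7)


YEAR = 2006

# The rule that was really in effect in 2006.
old_start = nth_sunday(YEAR, 4, 1)
old_end = last_sunday(YEAR, 10)

# The 2007 rule, applied (incorrectly) to 2006.
new_start = nth_sunday(YEAR, 3, 2)
new_end = nth_sunday(YEAR, 11, 1)

print("actual 2006 DST:      ", old_start, "to", old_end)
print("2007 rule run on 2006:", new_start, "to", new_end)
print("spring error in days: ", (old_start - new_start).days)
print("fall error in days:   ", (new_end - old_end).days)
```

The spring difference is 21 days and the fall difference is 7 days, which matches the "3 weeks in spring" and "one week in fall" in the report. Getting 2006 right needs per-year rules, which is the historical knowledge that is missing here.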
Now the .NET Framework is the one that attempts to move into the realm of using the date in question to determine how to evaluate it, as described in Raymond Chen's epic post Why Daylight Savings [sic] Time is nonintuitive, so technically the .NET Framework is the technology that is teasing here by making the promise of dynamic evaluation even though the historical data may not be there; Windows itself doesn't try to do that....
I guess the .NET Framework could be doing a slightly better job on Vista if it is using the new dynamic time zones I have mentioned previously. It might be worth giving that a try to see if it works (if not then someone should go up and report this as a bug so that it can be fixed in a future version!).
This post brought to you by ٭ (U+066d, a.k.a. ARABIC FIVE POINTED STAR)
So over the weekend, A. Skrobov asked in the Suggestion Box:
What does ASMO stand for in the names of the Arabic-ASMO codepages? I've been googling for a while and still couldn't find any organization of this name.
As far as I know, ASMO is an acronym for Arabic Standard Metrology Organization. Though I do not know if the organization still exists under that name or not (there is SASMO which is the Syrian organization but I couldn't find a web site for ASMO, myself).
Maybe somebody else knows what happened to ASMO and where it is today. Anyone?
This post brought to you by ا (U+0627, a.k.a. ARABIC LETTER ALEF)
Under discussion today are eight characters made famous in several prior posts:
Now obviously it takes an explicit act of typography to add letters to a font if they are not there. That is obvious and not at all what this post is about.
Instead, this post is about taking S/s (U+0053/U+0073) or T/t (U+0054/U+0074) and making use of them with U+0326 (COMBINING COMMA BELOW) and U+0327 (COMBINING CEDILLA).
It was back in April of last year that our good friend Cristi asked in the microsoft.public.win32.programmer.international newsgroup:
I tried to do the following Unicode character combination, using Times New Roman font: 0074 0327 (with no space between them) where 0074 is latin small letter t and 0327 is combining cedilla below. The expected displayed result should have been character t with cedilla below, but the actual displayed character is t with comma below. I know that in TNR font the glyph associated with U+0163 has comma below, but what has this to do (a distinct Unicode combined character) with separate base character + combining diacritical mark combination ? HOWEVER, the strangest thing is this: if the 0074 0327 combination is preceded by (let's say) 0061 0327 (that is the displayed character a with cedilla below), the displayed t with comma below becomes t with cedilla below ! What's the mess ? I tried this in Wordpad (WinXP) and MS Word. Cristi
Now it isn't all that confusing if you think about it. After all, when rendering, the first thing that is always attempted is to find a glyph in the font that represents the composite (combined) form of the letter and diacritic.
Trying to literally combine a diacritic with a base character at a specific defined "attachment" point is a second best option.
And of course the attempt to shove the diacritic in without that knowledge is a distant third (one that leads to problems like this one I talked about in Cyrillic that affects Bulgarian).
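The "look for the composite form first" step has a close analogue at the character level: canonical composition. The sketch below uses Python's unicodedata module to show which precomposed letter each base + combining mark sequence maps to. It only illustrates the Unicode data, not what the rendering engine or any particular font does internally.

```python
import unicodedata

# Base letter followed by a combining mark, as in Cristi's test.
sequences = {
    "t + U+0327": "t\u0327",   # COMBINING CEDILLA
    "t + U+0326": "t\u0326",   # COMBINING COMMA BELOW
    "s + U+0327": "s\u0327",
    "s + U+0326": "s\u0326",
}

composed = {}
for label, seq in sequences.items():
    # NFC composes the pair into one code point when one exists.
    result = unicodedata.normalize("NFC", seq)
    composed[label] = result
    print(label, "->", "U+%04X" % ord(result), unicodedata.name(result))
```

So 0074 0327 is canonically equivalent to U+0163, the very code point whose Times New Roman glyph Cristi mentions as having a comma below. That fits with the comma turning up for the combining sequence when the composite glyph is found first.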
The behavior that Cristi reported was before most of Microsoft's shipping fonts included the newer "comma below" characters but certainly after U+0326 existed and long after the Romanians had made it clear they preferred the "comma below" form to the "cedilla below" form in their text.
The font is simply using clues to try to decide which one to show and is trying to help Romanian documents look more Romanian, using surrounding clues in the text to try and find the best form to use. It is really just a somewhat sophisticated form of language detection via letter choice if you think about it.
That is actually kind of cool, in my opinion.
I really wish that this functionality existed in a callable form, but unfortunately it does not (MLang's encoding support includes a locale parameter but no sophisticated work to fill it in is used at present).
Of course this one particular occurrence of the feature is less important now in Vista where the support for the correct characters is there. But as a stealth feature that few people ever seemed to notice before, it is still pretty interesting, if you ask me. :-)
This post brought to you by ̧ (U+0327, a.k.a. COMBINING CEDILLA)
It's funny how sometimes I'll have a blog post on my list of posts to write that, once I get down to writing it, ends up very different than I originally imagined. And then other times, it is pretty much exactly as I had constructed in my mind, perhaps days or weeks or even months before.
This post is much more like the former than the latter, as the post ended up being influenced by someone independently asking me a question about the topic. :-)
Martin asked me:
Windows Vista now supports four new Romanian characters through updated fonts and a new keyboard layout, see http://blogs.msdn.com/michkap/archive/2006/11/19/1104093.aspx for images and Unicode codes. Pre-Vista, Romanians had to make do with similar looking characters from the Turkish alphabet, namely capital and lower-case s and t with cedilla, U+015e, U+015f, U+0162, U+0163. There is a vast body of Romanian documents and online content with these characters. Now, say, a Romanian Vista user wants to search a webpage or desktop for “Brașov”, a place name. With the default Romanian Standard keyboard, they enter the string in the IE search or Windows Desktop search with the new spelling, s with comma, U+0219. However, the content they are searching was created with the old spelling, using s with cedilla, U+015f. The search will find nothing. Technically this is by design, but for Romanians who deem these characters as two interchangeable representations of the same sound, this is a bug.
Now it is an interesting point, and one that really can give a person pause. After all, the Romanians have been objecting to use of the cedilla below characters in their language for about as long as people have been using it (and I still have to post that Every Character Has a Story post!), but even ignoring all that there is a serious legacy data issue to contend with, one that just adding it to fonts can't completely help with, no matter how cool stuff like this and this may be. Because it affects text processing as a whole.
Although with that said, the NLS Romanian collation tables on Vista will create the equivalences so that you will get the right results. Therefore:
ș (U+0219, LATIN SMALL LETTER S WITH COMMA BELOW) ≡ ş (U+015f, LATIN SMALL LETTER S WITH CEDILLA)
Ș (U+0218, LATIN CAPITAL LETTER S WITH COMMA BELOW) ≡ Ş (U+015e, LATIN CAPITAL LETTER S WITH CEDILLA)
ț (U+021b, LATIN SMALL LETTER T WITH COMMA BELOW) ≡ ţ (U+0163, LATIN SMALL LETTER T WITH CEDILLA)
Ț (U+021a, LATIN CAPITAL LETTER T WITH COMMA BELOW) ≡ Ţ (U+0162, LATIN CAPITAL LETTER T WITH CEDILLA)
In other words, people who use the Vista collation functions (CompareString, CompareStringEx, and so on) with either MAKELANGID(LANG_ROMANIAN, SUBLANG_DEFAULT) or ro-RO as appropriate will be able to get the right results here.
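For code that cannot call the NLS collation functions, the same four equivalences can be approximated by folding text before comparing. This is a minimal sketch of that idea, not how the Vista collation tables are implemented. The function names are invented for the example, and the folding is only appropriate when the text is known to be Romanian (in Turkish, s with cedilla is a letter in its own right).

```python
import unicodedata

# Legacy cedilla letters mapped onto the comma-below letters.
CEDILLA_TO_COMMA = str.maketrans({
    "\u015e": "\u0218",  # capital S
    "\u015f": "\u0219",  # small s
    "\u0162": "\u021a",  # capital T
    "\u0163": "\u021b",  # small t
})


def fold_romanian(text):
    """Compose combining sequences, then unify cedilla and comma forms."""
    return unicodedata.normalize("NFC", text).translate(CEDILLA_TO_COMMA)


def romanian_contains(haystack, needle):
    """Substring search that treats the two spellings as the same."""
    return fold_romanian(needle) in fold_romanian(haystack)


old_content = "Bine a\u0163i venit la Bra\u015fov"   # legacy cedilla spelling
new_query = "Bra\u0219ov"                            # typed on the Vista layout

print(new_query in old_content)                     # plain search misses it
print(romanian_contains(old_content, new_query))    # folded search finds it
```

Normalizing to form C first means that a base letter followed by U+0327 or U+0326 is folded the same way as the precomposed letters.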
Of course as Martin's question points out, other Microsoft products may not fare as well if they do not call our functions or create the equivalences themselves. Which is going to lead to some confusion among customers, and really make us wish that everyone was calling us to lead to the most consistent experience.
I'll talk more about this tomorrow, and how the .NET Framework fares here....
This post brought to you by Ț (U+021a, a.k.a. LATIN CAPITAL LETTER T WITH COMMA BELOW)
The other day Larry pointed out to me an article over on Macworld with an interesting little tidbit:
And now for the answers to the three-point stunts. Macworld Editorial Director Jason Snell created this challenge: Mac OS X supports a language invented in the 19th century by a Polish ophthalmologist, a language invented in the 20th century for a sci-fi movie, and a language that formed in the 10th century on a Pacific island chain. After U.S. English, make these your second, third, and fourth preferences respectively for your Mac’s application menus, dialogs, and sorting. Answer: The three languages are Esperanto, Klingon, and Hawaiian and can be located by opening the International system preference, selecting the Language tab, and then clicking the Edit List button. Esperanto is easy enough to find but Klingon and Hawaiian aren’t as Klingon is spelled in the Klingon language (it’s the tlhlngan Hol entry) and Hawaiian is likewise presented in its native spelling. (You’ll find it just below Hrvatski.)
Now I will admit that having a language like Klingon built-in to the platform has a certain visceral appeal in terms of cool factor.
It didn't make me want to go out and buy a Mac, but if someone had one I'd probably want to take a look. If for no other reason than to see if they really linked user locale and UI language into a single choice like that. Ugh! :-)
But I was actually reminded of something we found out with one of our Latin transliteration locales that was added in XP SP2, Inuktitut. The sorting data was added for it initially, but this was later discarded as it really wreaked havoc with all of the non-Inuktitut data. I suspect that the Latin transliteration of Klingon and its sort might do the same. And since Klingon in its own script was rejected (I may have mentioned it in passing here or here), it seems like Klingon collation would probably be distracting....
But in the end, having the Klingon locale on the list of UI languages for a Mac does make me a bit jealous since it can't really be done in Windows just yet, even in Vista (as I pointed out in Thinking about MUI is making me bipolar). So it is not so much to do with wanting to select Klingon in the UI language list as it is wanting it to be in the list, you know?
I should probably find out if Shawn (who insists he is not a Klingon!) has a Mac. Or if he plans to get one now? :-)
Personally, I have found that with the exception of specific language communities, the custom locale that demos the best is the valley girl one, as it is the one that the most people are likely to be able to follow without having to know a language. Ones like Klingon are fun to point out but showing them off does not have as much appeal since until the Klingons invade, most people don't know Klingon.
Which doesn't mean I don't wish it could be in my UI language list....
This post brought to you by ᗼ (U+15fc, a.k.a. CANADIAN SYLLABICS CARRIER KKO)
Legolas (no relation to the elf of the same name!) asked:
Hi, I'd like to suggest something that may be in line with your recent 'convert to unicode' posts, although it's very specific, perhaps too specific. Anyway: for a print monitor, the function you pass in the MONITOR2 structure as pfnStartDocPort, gets a DOC_INFO_1 or DOC_INFO_2 structure passed as a LPBYTE. the level parameter tells me wether it's 1 or 2, but what I currently can't find: how do I know if I'll get the A or the W version? I can see nothing to indicate my preference, and if I read the docs right users of printers can pass whatever they want. I'm guessing windows is converting to the one I need, but how does it know what I need?
The answer -- it is ONLY ever Unicode.
CLUE #1: You can tell by the prototype in the documentation for StartDocPort and in winsplp.h:
BOOL (WINAPI *pfnStartDocPort) ( HANDLE hPort, LPWSTR pPrinterName, DWORD JobId, DWORD Level, LPBYTE pDocInfo );
See that 2nd parameter? It's an LPWSTR. So this function can only be getting Unicode.
CLUE #2: The DOC_INFO_1 and DOC_INFO_2 structures choose their Unicodality based on the function called that gets them.
The fact that NT-based platforms only support DOC_INFO_1 (according to the docs), combined with the fact that the function is only in newer versions (which feeds into the fact that ANSI is not being added so much anymore), kinda conspires to require it all to be Unicode.
So, to indirectly answer the bigger question that Legolas posed -- to find out if a function takes Unicode or ANSI in otherwise ambiguous cases, one probably just has to do some spelunking around in the SDK and the DDK and the header files. :-)
This post brought to you by ឈ (U+1788, a.k.a. KHMER LETTER CHO)
Over in the Suggestion Box, Mike (I think we all know which Mike that might be!) asks:
The IE7 RunOnce page which is loaded after IE7 is installed gives users control over anti-phishing and accept-language settings etc. For some reason it sets the radio buttons so that the English(US) is set as the default rather than the current language setting of English(UK) or English(Australia). This was bugged during the IE7 and Vista betas, repro'd by other users, and verified by the triagers .... yet it hasn't been fixed. What's going on?
This is a bug that I remember was reported more than once during the product cycle. Results sometimes varied a bit from build to build, and at one point the "RunOnce" page had a bunch of settings that weren't even trying to pick up either machine settings or settings from the post-setup "OOBE" component that asks questions as a part of the installation of Vista. So the thought was why have the page at all, I guess. Which makes sense if the right setting was happening, though clearly it wasn't at least some of the time.
Especially when one considers that IE doesn't use the region but instead uses the default user locale, it is decidedly odd that it is so adamant about improperly using the setting here, isn't it?
And super especially when one considers the fact that IE moved away from region-neutral language choices and moved into the specific ones so wholeheartedly?
What the hell is going on here? Is this yet another example of the problem I have described in posts like Using full locales rather than the neutral ones?
Well, to some extent the answer to the preceding question is YES.
Though to be honest that is a pretty lame answer, and there has to be more to it. I mean, especially since neutral support on Windows does not exist, an application like IE has to go out of its way to support anything but a specific language, and has to go out of its way to turn around and support the wrong one....
To add insult to injury, at one point this very same guy Mike was reporting that the original bug of the settings being wrong was fixed BUT there was a new problem where that site which comes up once that lets you change settings was defaulting to the wrong thing. Which may still be the behavior now, I can't tell from the above if that is still the case or if the core, original setting that was finally correct managed to break again. What are they basing it on -- install language?
What exactly is going on here?
Well, one of two things.
Either there is something very unique to Mike's settings that few other people see (which I do not believe to be the case, especially since other people did validate his bug).
Or there are a bunch of people hitting problems here -- which seems a lot more likely.
They don't listen to me, though. And they clearly never managed to address a bug that was reported some time ago (I saw the bug, and it was resolved "not repro.").
Assuming the bug is indeed still around (as it seems to be), I'd ordinarily recommend that the lot of them head over to the IE Blog and start complaining. Though they don't seem to have a suggestion box and it is kind of rude to start randomly commenting in unrelated posts (I consider it rude here so obviously it's not something I'd call acceptable there). If you look at the blog there are lots of exciting posts about new localized releases being available, but not so much about core language support. Which is maybe another proof that it is broken (if it weren't, wouldn't they be blogging about the feature?).
Anyway, currently my Vista machines run IE7, though that is likely because I am reinstalling often (newer builds, other configurations). My XP/Server 2003 machines are in all but one case running IE 6, mainly out of spite over it being made a "critical update" which annoys me.
To be fair I'll try out IE here and see what it does (something I have done several times during the beta but nothing recently). I've got a bit of free time this weekend, I'll let you know how it goes. If it doesn't work out I may have to reassess my browser choice....
This post brought to you by പ (U+0d2a, a.k.a. MALAYALAM LETTER PA)
In response to Raymond's What('s) a character!, Miral commented:
This whole discussion is why I heartily wish that *all* WinAPIs, without exception, exclusively used a count of bytes and not characters or storage characters or whatever. I know, I know, no time machines. Doesn't stop me grumbling about it though :)
Well, let me state for the record that I am glad that Miral has no time machine!
I mean, let's consider what this would mean for applications that may or may not support Unicode (like that one I built just recently that we are shipping soon!). How on earth would such an application handle a byte count that is randomly doubled depending on which version of a function it happens to be calling?
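To make the doubling concrete, here is a minimal sketch in Python. The function name `buffer_sizes` is made up for illustration, and I am assuming code page 1252 as a stand-in for whatever the "A" version of a function would use and UTF-16LE for the "W" version; it is not how any particular API computes its sizes.

```python
def buffer_sizes(text):
    """Compare a character count with the byte counts of two encodings.

    Assumptions (not from the original post): cp1252 stands in for the
    ANSI ("A") flavor of a function, UTF-16LE for the Unicode ("W") flavor.
    """
    ansi_bytes = text.encode("cp1252")      # one byte per character here
    wide_bytes = text.encode("utf-16-le")   # two bytes per UTF-16 code unit
    return {
        "characters": len(text),
        "ansi_bytes": len(ansi_bytes),
        "wide_bytes": len(wide_bytes),
    }


if __name__ == "__main__":
    print(buffer_sizes("hello"))
```

With a count of characters the caller passes the same number to either flavor; with a count of bytes the number changes depending on which flavor the code was compiled against.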
Thoughts about Microsoft shipping out that mythical "MS Time" product to folks like Miral? Don't try to rock me to sleep with bedtime stories like that -- I can't believe people wonder why I am up posting at 3am so many nights? :-)
This post brought to you by 2 (U+0032, a.k.a. DIGIT TWO)
(Nothing technical in this post, whatsoever)
Just a quick update on the MS situation, for those who are curious. :-)
Well, first of all, I did get my driver's license after the retest (did better this time than I did the first time, technically!). I still think the state is discriminating (as I said before, though I know some disagree). For what it is worth, the person who set up the appointment for the test remembered me when I came back and actually went out with me for the test itself. She expressed some dismay over how the other location handled the whole thing, and since I passed she was happy to help me resolve the situation.
At least until 2011 when I get to maybe go through it all again? I'll be sure to go to the same office next time, at least!
In other news, I have decided to stop the Novantrone treatments for the time being. They really have done as much good as they will for now. I'd rather save my remaining doses for the future if they are needed (since there is that whole 12 doses lifetime maximum thing I mentioned in my Napalm post).
Besides, it's hard to argue that voluntary infusion of a chemotherapy drug is not in some sense intentional self-destructive behavior. :-)
Novantrone has done a good job of acting as something of a RESET button on a lot of my MS symptoms, which has been a good thing. But at this point I think I'd rather hoard those last few infusions I am permitted for some possible future time when the reset button is needed again....
This post brought to you by 독 (U+b3c5, a.k.a. HANGUL SYLLABLE TIKEUT OKIYEOK)
Yesterday in response to When is a character not a character?, reader Bart commented:
Maybe you should write a post about how the concept of a character in the sense of wchar should be deprecated for uses other then datastorage or maybe a codepoint. And maybe explain how to handle the kind of characters this article is about and what sets them aside from 'normal' strings. (and maybe how to recognize them so that you can still do things like ReplaceStr)
Funny me, I thought the issues behind the first part of what he is talking about were kind of what I had been doing in all these posts like that one and this one and this one and a whole bunch of others. Hell, Raymond even covered this recently. :-)
Though I am not sure I agree that the best answer is to deprecate whole definitions here. I mean, after all, Definitions are context sensitive. For most purposes programmers need to care about the old definition they have, since it controls important aspects like memory allocation, maximum buffer sizes, and so on. Telling them that their thinking is all wrong is just not such a good idea, since for the most part their thinking is spot on.
The fact remains that developers need to understand that taking one letter and adding a whole buttload of diacritics still gives you what is basically one character in some sense. One simply ought to be aware of these two definitions so that foolish things, like doing one's own cursor movements in a control rather than letting Uniscribe data and so on do the work, can be avoided. Because sometimes ligatures ARE supposed to be thought of as 'single characters' and other times they are not, so it is best to let the system help with these types of decisions rather than rolling your own, since the OS has a lot more data to work from.
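A small sketch of the two definitions side by side, in Python. The function name `count_units` is made up for illustration, and counting the non-combining code points is only a rough approximation of "what the user sees as one character"; it is not what Uniscribe does and it is not full grapheme cluster segmentation.

```python
import unicodedata


def count_units(text):
    """Count a string three ways: storage units, code points, base letters."""
    return {
        # The "storage" definition: 16-bit units, as a WCHAR buffer would hold.
        "utf16_units": len(text.encode("utf-16-le")) // 2,
        # Unicode code points.
        "code_points": len(text),
        # Rough approximation of user-perceived characters: code points
        # whose canonical combining class is 0 (diacritics are nonzero).
        "base_letters": sum(1 for ch in text if unicodedata.combining(ch) == 0),
    }


if __name__ == "__main__":
    # LATIN SMALL LETTER E + COMBINING ACUTE ACCENT + COMBINING DOT BELOW
    print(count_units("e\u0301\u0323"))
```

The buffer needs room for three units, yet a cursor should step over the whole sequence in one move, which is exactly why both definitions stay useful.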
I'll talk about the find/replace issue another day. :-)
This post brought to you by ় (U+09bc, a.k.a. BENGALI SIGN NUKTA)