Postings are provided as is with no warranties, and confer no rights. Opinions expressed here are my own delusions; my employers at best shake their heads and sigh, at worst repudiate the content with extreme prejudice, whenever it manages to appear on their radar.
This blog is unsuitable for overly sensitive persons with low self-esteem and/or no sense of humour. Proceed at your own risk. Use as directed. Do not spray directly into eyes. Caution: filling may be hot. Do not give to children under 60 years of age. Not labeled for individual sale. Do not read 'natas teews ym' backwards. Objects in mirror are closer than they appear. Chew before swallowing. Do not bend, fold, spindle or mutilate. Do not take orally unless directed by a physician. Remove baby before folding stroller. Not for use on unexplained calf pain.
A nice FLAIR (FLuid Attenuated Inversion Recovery) view from the not-too-distant past. Every abnormality you can see on this scan (and there is more than one!) is asymptomatic at present. Alongside is a picture of me walking the walls at Fremont Studios, a sign of a damaged brain.
Over in the Suggestion Box, Daniela Semeco asked:
Hi Michael, I have enjoyed reading your blog a great deal. I am inventor, and I'm looking for a keyboard device driver developer. Could you please recommend someone to me or suggest where to look? I've done what I could with MSKLC and KBDEdit, and I realize that this will not suffice. I need something to be written from scratch. I am in San Francisco and would like to hire someone in the Bay Area. I would feel more comfortable meeting them in person, especially since it involves intellectual property. I'm looking to have software developed for Windows, Mac and Linux OS, as well as for iPad and Android tablet. I would appreciate any feedback. Many thanks! -Daniela
The landscape in this space is indeed bleak.
My first thought would have been Marc Durdin and Tavultesoft, though despite his visits to the Bay Area and Seattle, he is still located in Australia.
He should be up here and down there in the near future if you wanted to meet, and I know he has dug deeper into the keyboard space than anyone outside of Microsoft I've ever known (and most of the people inside, too!).
He may reach out to you after seeing this blog; if not, feel free to reach out to him....
Beyond that, I've been pretty unimpressed with others I've interacted with in this space.
And I've interacted with a lot of them over the years.
It's hard to say more without knowing more about what you're looking for; perhaps there are people who would be helpful in some specific area?
Or if you're at the Internationalization and Unicode Conference in Santa Clara in October (site), you can see me there and tell me more....
The other day my blog The Locales of Windows 8, not yet divvied up... went up.
Some people made observations.
Some people asked questions.
A few of them even did all that in the comments....
Like Andrew West, who commented:
I've said it before, and I'll say it one last time, I think the MS spelling of ᠮᠤᠨᠭᠭᠤᠯ (m-u-n-g-g-u-l) for the more usual ᠮᠣᠩᠭᠣᠯ (m-o-ng-g-o-l) is bizarre. On a different matter, can you explain why Basque, Catalan and Cherokee repeat the language name in parentheses instead of giving the name of the country in which the language is spoken, as is the case with all the other locales on the list? And why is "Iran" present in the native name for Persian but omitted from the display name?
I forwarded on that Mongolian comment previously, but I will do it again. :-)
For Persian I'm not sure - maybe that whole "for the ex-pats, with no contact with the country" thing came into play.
I think the fact that we don't say it's Farsi anymore was the biggest issue for them.
Those three cases where the BCP-47 tag has a country/region but the name does not -- and the fourth case involving moh-CA aka Mohawk (Mohawk) -- are I think just cases where the locale is covering dependent nations (not sovereign nations) where it just makes people more comfortable. Just trying to be a little respectful....
And then Azarian pointed out some of the same things and some new things:
Interesting: English (Caribbean) having weird code of en-029. English (Republic of the Philippines) having native name of English (Philippines). Cherokee (Cherokee) and Catalan (Catalan) with a non-country country name. Persian with no country name at all. Norwegian, Bokmål (Norway) being the only Display Name with a non-ASCII character. And no less than 6 Sami locales.
Now the English (Philippines) vs. English (Republic of the Philippines) case is just a side effect of the English display name coming from us and the native name coming from the native reviewer.
It is a good candidate to clean up at some point -- we don't have a good reason to use the long name if the language expert doesn't think it's important! :-)
The Norwegian, Bokmål (Norway) case is a longstanding issue we've just always had, in part because no Norwegian we've ever worked with or talked to found it acceptable to use Bokmal instead.
As for that code, Doug Ewell explained it:
@Azarien: '029' is the UN M.49 code element for the Caribbean.
It's a little embarrassing, but we had this English (Caribbean) locale, but we had no ISO-3166 code for it. So someone found that 029 and we went with it.
There's probably something smarter we could have done later via BCP-47 but no one wanted to change the name again, so it stayed that way.
As for the "no less than 6 Sami locales" -- we have nine Sami locales!
Doug Ewell's other comment:
'ko-KR' should not expand to "Korean (Korea)." The name of the country is Republic of Korea, or South Korea, or whatever. Just because North Korea is isolationist and there is no localized version of Windows for it, that doesn't mean it doesn't exist. This is particularly noticeable given the indulgent "Bolivarian Republic of Venezuela."
Since we don't have North Korea or DPRK on the list, the usage of South Korea is not necessary -- it's the only Korea we ship to. :-)
So it is our "short name"....
I wouldn't consider the Venezuela name to be indulgent -- but someone asked us to update the English display name so we did. It's the kind of thing we could probably do for English (Philippines) at some point.
The last comment there was from Paul B.:
Great. Next, tell us the delta from Win7 to Win8 - what's new and what's changed? :-)
Now that would have to be a whole new blog, some other day!
I was disappointed no one asked about the difference with the display name and native name of Macedonia, but maybe people knew I already covered it....
You may remember prior blogs like Regarding the overthinking and underimplementing of names from last year that talks a bit about the names we cart around or Maybe they're just showing off their fancy fonts? ;-) from a few months ago that openly pointed out one of the big problems of using native language names in user interfaces.
The simple fact I pointed out:
The odd one out here is the Native Names, since each one is most often provided by a different person, one per language - sometimes based on preferences or standards requirements or grammatical rules in each language!
Anyway, someone was asking again about the different capitalizations in different languages that some other UI was showing.
And someone else, in partial response, pointed out that someone else owned the data, but it may be per language choices. And it continued:
They may also know, assuming that's true, if there are scenarios in which it's appropriate for a component to force capitalization.
It is true.
But we don't track when languages would be okay with capitalizing being done by someone else later.
There is no LOCALE_INATIVENAMECANBECAPITALIZEDIFYOUREALLYREALLYWANTTODOTHAT flag.
I shudder to think of how we would manage all that, myself.
Luckily, the person asking just went ahead and resolved the issue as by design, since it is. ;-)
Now in the past, I've written The Locales of Windows 7, all divvied up, which included:
I've also written the sequel, The Locales of Windows 7, divvied up further, which included the slightly more niche:
Since then, I've also written The evolving Story of Locale Support, part 13: Divvying up locales, yet again!.
But one list has never yet been published.
And I'm gonna publish it now. :-)
Table 8: The locales of Windows 8 (full list)
I'll divvy them up another time.
For now I'll love the list!
Back in the end of June, Todd Brix wrote a blog on the Windows Phone Developer Blog entitled First look at the Windows Phone 8 Marketplace.
In it, there was a huge list of regions, covering large parts of the world.
180 places being covered by the AppHub and the Marketplace.
This is huge, as is the already announced news that Windows Phone 8 would be heavily using the Windows 8 core.
A core that includes the Windows 8 locale data!
The Windows Phone 8 team has on it many people who have been holding our feet to the fire as they had the combination of the joyful experience of more coverage and updated data with the painful experience of changes and even a few bugs.
But it was really amazing to work with them and partner with them here -- we are a better product because of the work they were doing as a part of their product....
One of the most frequent sources of weirdness was inconsistencies between data fields that would reasonably be considered related -- like LOCALE_SSHORTTIME and LOCALE_STIMEFORMAT.
In the end, when they were different it was just due to historical reasons, and the lack of anyone reviewing them for differences to date.
That kind of cleanup has started largely due to their diligence and reports, and further cleanup in the future will definitely take their issues into account.
Plus it is really cool to see these things on Windows Phone 8!
It is well known that intl.cpl is a relentless beast that does what it is asked to with cunning and ruthless efficiency.
If you know how to ask for it, at least. :-)
The question from the other day was simple enough:
After installing language pack on a English OS, the welcome screen language can't be changed back to English if it was changed to another language by running following command in system context.
control intl.cpl,, /f:"settings.xml"
setting.xml contains:
<gs:GlobalizationServices xmlns:gs="urn:longhornGlobalizationUnattend">
  <gs:UserList>
    <gs:User UserID="Current" CopySettingsToSystemAcct="true" />
  </gs:UserList>
  <gs:MUILanguagePreferences>
    <gs:MUILanguage Value="da-dk" />
    <gs:MUIFallback Value="da-DK" />
  </gs:MUILanguagePreferences>
</gs:GlobalizationServices>
I don't know about you, but I like it when the answer is right there in the question!
Did you see it?
It is in the <gs:User UserID="Current" CopySettingsToSystemAcct="true" /> line.
Because by setting the user account and requesting that the setting be copied to system accounts, one has formally requested that the copy take place.
The user context/system context difference is simply whether the user running the script has permission to make the change!
Just like in this UI in Windows 7/Server 2008 R2/Windows 8/Server 2012 intl.cpl:
or this one in the Vista/Server 2008 intl.cpl:
The follow up question was:
Is there a way to revert this change on machines where this XML file was already used and the machines are in problem state?
Unfortunately, no.
It doesn't store any previous settings.
And there is no mulligan....
So, to fix, one would have to:
When this unattend feature was being created in Longhorn (which became Vista), this very scenario was discussed.
But the fact that the UI didn't support it made it less interesting, so the idea was dropped, especially since just running two files could get the desired effect.
Just like steps 1 and 2, above!
The note from the Suggestion Box was:
Loading keyboard dlls in a 64-bit environment using a 32-bit application
Hi Michael!
I've been enjoying your topics regarding the keyboard dlls (kbd**.dll) files and how to mess with them :)
Earlier I decided to create an on-screen keyboard of my own, based on scan codes, so I could take advantage of these existing keyboard dlls.
In my path of development; I did not get it to work on a 64-bit system, the pVkToWcharTable was always NULL. It only worked if I compiled a 64-bit version of my app.
I've written a class to actually manage 32-bit app on a 64-bit system, and it works. I've even shared it to the public, by writing an article: http://lars.werner.no/?p=870
Since you always investigate deep, do you have any root cause of what actually fails?
Hope to see an article on that later on!
Cheers,
Lars Werner
http://lars.werner.no
We don't actually document how to act like Windows does when it uses one of these keyboard layout DLLs.
Previous travails, discussed in The wacky world of WOW64 keyboards, un-leashed, un-locked, un-something-or-other and If you just don't think you can hold it (64-bit style!), may seem relevant.
But they aren't -- not directly at least.
Yet by looking at what happens differently in 64-bit builds, one can reverse engineer the differences for 64-bit builds either directly here or indirectly by looking at kbd.h and how it defines the structure differently for the 64-bit case.
I'd help, but that would be a Microsoft employee doing the work to document the stuff that we decided not to document....
The latest question that passed through the gauntlet was:
I’m working with .NET and I need to be able to map a set of two-letter country codes (e.g. US, IE) into the appropriate RegionInfo object. I’ve got this working correctly for all regions defined in .NET, but we’ve recently discovered that there are a number of regions which are not present in .NET including Andorra and Antigua and Barbuda. I realise we can overcome this problem by creating custom cultures, but this seems like it might be a maintainability issue. In case it’s relevant, my team is using .NET 3.5 due to various limitations for now.
What a nightmare to create that many custom cultures, none of which would likely have useful data..
All without getting reviewed the way we review data now! :-(
I think the wider list can be picked up with a simple p/invoke call to EnumSystemGeoID instead.
GEO stuff has limitations (as I mentioned in the past), especially around time zones and such, but for names it should work pretty well!
Say code like the following:
using System;
using System.Text;
using System.Runtime.InteropServices;

class Program {
    // Optional name to look for; empty means "list everything".
    private static string stTarget;

    static void Main(string[] args) {
        if(args.Length > 0) {
            stTarget = args[0];
        } else {
            stTarget = string.Empty;
        }
        EnumSystemGeoID(GEOCLASS_NATION, 0, GeoInfoProc);
    }

    // Called once per GEOID; returning false stops the enumeration.
    private static bool GeoInfoProc(Int32 geoID) {
        // First call: ask for the buffer size (includes the null terminator).
        int len = GetGeoInfo(geoID, GEO_FRIENDLYNAME, null, 0, 0);
        if(len > 0) {
            StringBuilder data = new StringBuilder(len);
            len = GetGeoInfo(geoID, GEO_FRIENDLYNAME, data, len, 0);
            if(len > 0) {
                // Drop the trailing null; this is the friendly name, not an ISO code.
                string stName = data.ToString(0, len - 1);
                bool fFound = string.Compare(stName, stTarget, StringComparison.OrdinalIgnoreCase) == 0;
                if(stTarget.Length == 0 || fFound) {
                    Console.WriteLine(string.Format("{0} = {1}", stName, geoID));
                }
                // Keep going until the target (if any) has been found.
                return !fFound;
            }
        }
        return false;
    }

    // Geo Type
    private const uint GEO_NATION = 0x0001;
    private const uint GEO_LATITUDE = 0x0002;
    private const uint GEO_LONGITUDE = 0x0003;
    private const uint GEO_ISO2 = 0x0004;
    private const uint GEO_ISO3 = 0x0005;
    private const uint GEO_RFC1766 = 0x0006;
    private const uint GEO_LCID = 0x0007;
    private const uint GEO_FRIENDLYNAME = 0x0008;
    private const uint GEO_OFFICIALNAME = 0x0009;
    private const uint GEO_TIMEZONES = 0x000A;
    private const uint GEO_OFFICIALLANGUAGES = 0x000B;

    // Geo Class
    private const uint GEOCLASS_NATION = 16;
    private const uint GEOCLASS_REGION = 14;

    private delegate bool EnumGeoInfoProc(int geoID);

    [DllImport("kernel32.dll")]
    private static extern bool EnumSystemGeoID(uint geoClass, int parentGeoID, EnumGeoInfoProc enumGeoInfoProc);

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, EntryPoint = "GetGeoInfoW")]
    private static extern int GetGeoInfo(int geoID, uint geoType, StringBuilder geoData, int size, int langID);
}
It will build a nice list that happens to include all of the various names requested....
This code or code like it should work just fine with 3.5 or 4.0 or even 4.5.
So screw RegionInfo! You can get your GEO on, instead....
The other day, Peter Parker (probably not his real name) asked:
How can I determine if a font is fixed width, like Visual Studio does?
Well, he certainly can't rely on his spidey sense even if he was the real Peter Parker, since Spiderman doesn't scale! :-)
When he referred to "like Visual Studio does", he meant the Bold items on the list:
The principle is simple enough:
You can see many examples of this being correctly done for Meiryo/Meiryo UI (since the Latins are based on Verdana) and Microsoft YaHei/Microsoft JhengHei (since the Latins are based on Segoe UI) and so on.
It may even interest our Marvel action hero enough to want to emulate the checks.
However, there are three problems here;
I suppose there could be an IsFixedWidth method where you can pass the font name and some text.
It could then return whether all of that text you passed in is the exact same width per character!
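To make that idea a bit more concrete, here is a minimal sketch in Python. Nothing like it ships anywhere that I know of: is_fixed_width, the measure callback, and the demo_measure metrics are all made up for illustration, and a real version would have to ask the actual font for its advance widths.

```python
from typing import Callable


def is_fixed_width(font_name: str, text: str,
                   measure: Callable[[str, str], float]) -> bool:
    """Return True if every character in `text` has the same advance width.

    `measure(font_name, ch)` is a hypothetical, caller-supplied hook that
    returns the advance width of one character in the named font.
    """
    widths = {measure(font_name, ch) for ch in text}
    # Zero or one distinct width means nothing in this text breaks the grid.
    return len(widths) <= 1


def demo_measure(font_name: str, ch: str) -> float:
    """Made-up metrics for illustration: ASCII is 7 units wide, the rest 14."""
    return 7.0 if ord(ch) < 128 else 14.0


print(is_fixed_width("Demo Mono", "Hello, world", demo_measure))  # True
print(is_fixed_width("Demo Mono", "Hello, 世界", demo_measure))    # False
```

The answer depends on the text as much as on the font, which is why the text has to be a parameter.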
For now, you are mildly on your own checking for this.
Even if you spin a web any size and catch thieves just like flies...
For the record, this is not a political Blog.
Though I suppose this one posted blog is.
There is no way I want to make this a common thing.
And it goes without saying that Microsoft is not involved....
I just have a question.
Now politics in the US are nice and hyper-polarized, as pictures like this point out:
Whatever.
This is not really about the diagram.
But a lot of the rhetoric argues between liberals who support the Buffett Rule versus conservatives who argue this would affect job creation and some say investment as well.
Okay, let's momentarily stipulate that both points of view have merit.
Maybe they do, at some level, after all.
Now both job creation and investment in the economy are understandable and potentially measurable things.
So why not just measure them?
Why not hinge the tax rate of capital gains on whether or not jobs are created or the investment in question measurably helps the economy in some way?
How come I have never seen such a plan suggested?
I mean, I doubt anyone would try to publicly defend the person or company that lays off 500 people and hurts the environment willingly as deserving better tax rates than the one that hires 500 and helps build a highway.
And if you accept that, then all you need to do is figure out what rates to use for what.
So let's tax them according to what they accomplish with the money, and pass a bill!
Simple? Of course not. But we are paying our lawmakers to solve tough problems.
Like this one....
I knew at some point I'd have to deconstruct the support of International Domain Names in Internet Explorer.
In other words, IDN in IE. :-)
Unfortunately, it is pretty complicated.
So I was looking forward to it like I'd look forward to a root canal, you know?
Thankfully, Eric Lawrence saved me the trouble!
In EricLaw's IEInternals, he wrote Brain Dump: International Text last month, where he explained the IDN settings in this picture:
He also covered all of these "International" settings from the dialog.
In particular:
Send IDN server names is enabled by default and will force IE to encode hostnames in URLs following the rules of RFC3491 and RFC3492. The user will be shown the URL in the address bar in Unicode form if and only if the URL is deemed non-spoofable. Please see this IEBlog post on the rules of IDN Non-spoofability.
Send IDN server names for Intranet addresses is disabled by default for compatibility with legacy Windows networks that were using UTF-8 to support non-ASCII hostnames. Other browsers, to the best of my knowledge, do not have special handling for Intranet sites, and I believe that current versions of Active Directory and the Windows DNS server support punycoded hostname registration and lookup.
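For anyone who has never seen what that encoding does to a hostname, here is a small sketch using Python's built-in idna codec, which applies Nameprep (RFC 3491) and Punycode (RFC 3492) one label at a time. It is not IE's code, the helper names are made up, and it does not model the non-spoofability rules that decide what the address bar shows.

```python
def to_wire_hostname(hostname: str) -> str:
    """Encode a Unicode hostname into the ASCII form that goes on the wire."""
    # The stdlib "idna" codec splits on dots and encodes each label.
    return hostname.encode("idna").decode("ascii")


def to_display_hostname(ascii_hostname: str) -> str:
    """Decode an ASCII (xn--) hostname back into its Unicode form."""
    return ascii_hostname.encode("ascii").decode("idna")


print(to_wire_hostname("bücher.example"))            # xn--bcher-kva.example
print(to_wire_hostname("www.example.com"))           # www.example.com
print(to_display_hostname("xn--bcher-kva.example"))  # bücher.example
```

Plain ASCII names pass through untouched, which is why most people never notice the setting is on.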
Since he did all this work, it saves me the trouble!
Kinda great teamwork, he and I.
I guess there's no "I" in IE, either! :-)
Regular readers may remember my Four cases where I don't like ResolveLocaleName (and you shouldn't either!) from a few months ago.
I described a terrible situation with ResolveLocaleName.
This function, added in Windows 7, would do all of the following:
and I pointed out how this function works. It works badly.
Well, one person took this to heart, my friend Brendan.
He entered a bug on this for Windows 8 (I was actually just looking at the Win7 repro but it was I guess in Win8 too!).
Anyhow, despite how late it was they did do a quick fix!
So, once you have RTM bits, the results are much more reasonable:
Wow -- much better!
Now, we just need to port it to Windows 7... :-)
So, the other day, I shared a picture on Facebook.
Nothing I haven't done before.
As usual, I was just sharing someone else's picture (I seldom create pictures myself).
Just passing the time.
Sometimes people ignore them.
Other times people "Like" or even "Share" them.
Every once in a while one might inspire a kiss from you-know-who.
Anyway, then something got my attention.
Looking at the posted picture, I was given a new option:
What the hell?!?
For the record, and to all my friends, I will never pay to announce any post. If you see it and like it then GREAT.
But my value system in its current form requires the approval of just one person. And she isn't for sale, with or without Facebook pimp proxy.
I hope Zuckerberg has less creepy ways for Facebook to make money post-IPO than the "buy more approval from your friends" technique.
Probably good that friends won't be told someone announced posts this way, I can't think of a better metric for unfriending....
Some questions confuse me even when I know the language in which they are asked fluently.
In my experience, the most common cause is when the person asking the question is trying to connect two things together that lack a direct, useful connection.
In such cases, understanding what the true question is involves kicking the living crap out of (or rather, disproving) the "wrong" question, and helping the right one emerge.
Take for example the following question sent the other day:
Is there a table that shows what the delta is between [the list of supported input languages] and [the list of system locales]? Hindi is one of the examples, and I am interested in a comprehensive list.
Hmmm.
Okay, we'll start by defining the two terms:
[the list of supported input languages] - Mostly you can go to HKLM\SYSTEM\CurrentControlSet\Control\Keyboard Layouts\ and enumerate the subkeys under it. This will give you the list of the keyboards.
IMEs are not included, but all of the existing IMEs fall in two categories:
Thus, all of them are either Unicode-only or Unicode-only enough for current purposes that they may as well all be treated that way.
For the keyboards under that registry key, every subkey name is a KLID; if it is converted to a hexadecimal number, then LOWORD(<KLID>) is usually but not always a LANGID.
You can pass those LANGID values to GetLocaleInfo(LOWORD(<klid>), LOCALE_IDEFAULTANSICODEPAGE, ..) and if you get back a 0 (CP_ACP) then you have a Unicode-only locale.
Note that if LOWORD(<klid>) is 0x0c00, then it's either one of those locale-less keyboards we added in Windows 8 (most but not all of which are Unicode-only) or a custom keyboard created by MSKLC based on a custom locale (which does not have a known code page).
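The LOWORD step is simple enough to sketch in a few lines of Python. This only covers the arithmetic on the KLID string. It does not touch the registry or call GetLocaleInfo, the function and constant names are made up for illustration, and the sample KLIDs are just illustrative inputs.

```python
# LOWORD value the text calls out for locale-less or custom-locale keyboards.
# The constant name is hypothetical.
LOCALELESS_LANGID = 0x0C00


def klid_to_langid(klid: str) -> int:
    """Parse a KLID string such as "00010409" and return LOWORD(KLID)."""
    return int(klid, 16) & 0xFFFF


def has_specific_langid(klid: str) -> bool:
    """False when the low word is 0x0c00, so no specific locale is implied."""
    return klid_to_langid(klid) != LOCALELESS_LANGID


for sample in ("00000409", "00010409", "00000c00"):
    print(sample, hex(klid_to_langid(sample)), has_specific_langid(sample))
```

Keep in mind the "usually but not always a LANGID" caveat: the low word is a candidate to look up, not a guarantee.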
Of the rest of the keyboards that do have ACP values, many of them contain characters outside of the corresponding default system code page.
Now while a LOCALE_IHASLETERSOUTSIDETHECP would potentially be quite useful, we don't have that. And it may not match the keyboard anyway.
Not to mention that there is no intrinsic queryable property or attribute of a KLID that can be used to easily identify what a keyboard supports.
SUMMARY: for almost every keyboard, you cannot find out whether the keyboard corresponds to a code page.
Any code page.
Popping the stack from this disaster for a moment, let's go back to the original question.
Now [the list of system locales] is a bit less messy to get.
Just EnumSystemLocalesEx will let you get that list, and if needed you can use GetLocaleInfoEx(<enumerated name>, LOCALE_IDEFAULTANSICODEPAGE,...) returning any number other than 0 (CP_ACP) to mean it's a valid potential system locale.
Easy, and even supports custom locales!
Of course they all have some overlap.
In the end, they only needed the second part anyway for what they had in mind - the real question was about working with characters off the default system codepage; keyboards were the easiest way to repro the problem and thus the repro overcomplicated things.
But now that you wade through either list, and even try to match them up where they overlap, a reasonable person can come to just one conclusion.
It isn't 1993 anymore.
If you depend on code pages then...
Sean sent me the following contact request, which I decided to make into a blog:
MSKLC allows entry of unicode characters but complains if they aren't in the locale codepage. But there doesn't seem to be a way to set the codepage -- ugh. How do i tell - the keyboard/windows/whatever - to just work in Unicode and forget all this 1258 nonsense??? heh... sorry for my 4am desperate ramblings...
The message did indeed arrive at 4:26am. ;-)
The answer can be found under View|Options... and the Options dialog it launches:
That CheckBox over near the bottom right, the one that says Include Validation Warnings Related to Codepages.
The help file topic explains how it works:
I'll tell you a secret.
Every single option on that dialog came from feedback from either the alpha pre-release or the beta pre-release, when someone gave me a keyboard layout with weird or strange behavior.
Like Sean's report, but earlier. :-)
At first I always warned, but then someone built a Hindi (Unicode only) keyboard.
So rather than a page of warnings covering nearly every key, I made it silently skip that validation for Unicode only locales.
Then someone else tried to build an Urdu keyboard -- code page 1256 but with many letters not on the codepage -- and she wanted a way to turn off those warnings entirely since those particular ones weren't useful in her case, even though they could be important in other situations.
That's when I added that CheckBox.
Later, someone building 15 different Arabic script keyboards got sick of having to uncheck that CheckBox over and over.
Suddenly, the Remember Settings After Shutdown setting was born.
And so on....
Anyway, that should help Sean out.
Another time I'll tell you the exact reason why I was so sensitive about tons of spurious warnings....