So, the question that came up the other day was an interesting one (identifying product details removed for both obvious and not-so-obvious reasons):
Michael, I’m a dev working on a feature kind of like “mail tips” and allow the text to be displayed to be customized by the Administrator. That customized text may be provided in any number of languages, and Outlook should display the one that most closely matches the user’s language settings in Outlook.
I’m contacting you because someone on my team recommended you as an expert in internationalization. I’m looking to get your expertise and knowledge and history in this area to hopefully help my team gain a better understanding of the issues and to hopefully guide us in the right direction.
Our scenario is as follows, the Administrator can specify a set of localization strings, each with an associated “locale” (whether that’s locale name, or an LCID, or something else is unclear – I’d love to get your guidance on this). Outlook is running in a given locale, or the “default” windows locale – which might not have a corresponding Outlook localization strings. This case is of particular interest. We need to support the administrator localizing his custom string into this specific locale (even though office doesn’t have localized strings for it).
Additionally, consider a case where the administrator hasn’t localized this string, into this very specific locale, but he or she localized it into a “very similar” locale – we’d want to display that similar locale. For example, imagine if the Administrator localized the string into French, but the user’s settings were for Canadian French – presumably we would want some way to figure out what is the “best” choice given a desired locale to display and a set of available locales to choose to display.
What I’m really asking for is some help in what to use to identify locales and how to accomplish this “best-match” algorithm. Thanks in advance for your help.
One of the core principles that international support in Windows/Office subscribes to is that there is a difference between the language of the user interface and the language specifying how to format/collate information (SQL Server works under slightly different core principles -- with the split being codepages/collation vs. UI language/formatting).
You can probably see some of the confusion in the question relating to what is a part of the user interface language and what isn't.
We actually have only a very limited number of cases where we support different UI languages that share a common language across two or more locales (and Canadian French vs. Parisian French isn't one of them that we really pay heed to).
In fact, despite other potential cases I've mentioned in the past (e.g. Arabic, English, French, Spanish), just:
are all we have in this limited category. Those other business cases simply have never been made convincingly enough, no matter how "self evident" their truths appear to be to folks like me! :-)
When you think about the nature of the differences, one can make a very strong argument to avoid fallback in these limited cases, since having the wrong Chinese or the wrong Portuguese pop up can be a huge customer satisfaction issue. Having it served up as the UI language for the sake of fuzzy matching, or more clever fallback....
The one reasonable case for fallback is in partially localized SKUs -- Language Interface Packs. But in that case, the system does a good job of fallback all on its own, and there is no need to add fuzzy fallback mechanisms atop what exists.
The best tool you can use here is one of two resource loaders (Win32 vs. managed), which will then try to do the best fallback it can.
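To make the idea of "the best fallback it can" concrete, here is a minimal Python sketch of parent-chain fallback. It is only an illustration, not how either resource loader is implemented: the names `fallback_chain` and `pick_string` are hypothetical, and it assumes a locale's parents can be found by dropping trailing subtags, whereas the real loaders use fallback lists defined by the platform (where, for example, zh-TW falls back to zh-Hant rather than straight to zh).

```python
def fallback_chain(locale_name):
    """Return the name and its parents, most specific first.

    'fr-CA' -> ['fr-CA', 'fr'] (assumes parents = fewer subtags).
    """
    parts = locale_name.split("-")
    return ["-".join(parts[:n]) for n in range(len(parts), 0, -1)]


def pick_string(desired, available, default):
    """Pick the administrator's string for the closest parent of `desired`."""
    # Locale names are compared case-insensitively.
    by_name = {name.lower(): text for name, text in available.items()}
    for candidate in fallback_chain(desired):
        text = by_name.get(candidate.lower())
        if text is not None:
            return text
    # No parent matched: use the neutral default instead of a sibling locale.
    return default
```

With strings supplied for "fr" and "pt-BR", a user running fr-CA gets the French string, while a user running pt-PT gets the default rather than the Brazilian text, in keeping with the point above about never serving up the wrong Portuguese.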
Though another concern raises itself in my mind: would this planned architecture make "OpenLocalization" a reality for some products? Imagine being able to change lots of the localization effort based on the user's UI Language. Though the extent of localization that could be overridden is unknown, since the current scope seems fairly limited (to some specific administrative error messages, where updates would probably be more likely to hit usability concerns than dialect issues)....
Stephen asks:
Hi,
We have been trying to use Welsh and found that the NLS table (http://msdn.microsoft.com/en-gb/goglobal/bb896001.aspx) maps Welsh to codepage 1252.
This has confused me because Welsh can contain Ux0174 (and others) that are not in 1252. (http://www.microsoft.com/globaldev/handson/user/welsh.mspx)
Is there going to be any changes to this mapping to create a Welsh codepage, or an equivalent of Celtic?
Thanks
Stephen
Unfortunately, we really are out of the code page creation business. It was never terribly lucrative.
And almost every language fails the test here now....
So Welsh has lots of distinguished company.
But use Unicode to support them all!
Must resist temptation to make a "Welsh/Welch pun"....
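For anyone who wants to see the gap Stephen describes, the following Python sketch lists the characters of a string that a given code page cannot represent. The helper name `missing_from_code_page` is made up for this illustration, and it relies on Python's own `cp1252` codec as a stand-in for the Windows code page.

```python
def missing_from_code_page(text, code_page="cp1252"):
    """Return the characters of `text` that `code_page` cannot encode."""
    missing = []
    for ch in text:
        try:
            ch.encode(code_page)
        except UnicodeEncodeError:
            # The code page has no slot for this character.
            missing.append(ch)
    return missing


# U+0174 and U+0175 (W with circumflex) are used in Welsh.
sample = "\u0174yn \u0175yr"
print(missing_from_code_page(sample))   # the two circumflexed letters
print(sample.encode("utf-16-le"))       # Unicode encodes all of it
```

The same string round-trips without loss through UTF-16 or UTF-8, which is the whole argument for using Unicode rather than waiting for a new code page.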
Reader Angel asked about Digit Substitution:
Hello Michael!
I randomly found your blog while googling about digit substitution. I read all the entries in your blog about this subject, but didn’t find an answer for my problem. However, I think you might be able to help me.
I’m working on a MFC application which uses several custom controls. One of those custom control displays text using the DrawString method from GDI+, and digit substitution is not working very well on it. I’m getting the following results:
| User Locale (Format) | Standard digits      | Use native digits | Displayed digits |
|----------------------|----------------------|-------------------|------------------|
| English (USA)        | Arabic               | National          | Arabic           |
| English (USA)        | Hindi                | National          | Arabic           |
| Arabic (Egypt)       | Arabic               | National          | Arabic           |
| Arabic (Egypt)       | Hindi                | National          | Hindi            |
| Arabic (Egypt)       | Other strange digits | National          | Hindi            |
So for me it looks like the substitution is only performed if the selected ‘Standard Digits’ are in the codepage for the locale. Because of this, digit substitution only works with the Arabic locale and Hindi or Arabic numbers. If the ‘Standard Digits’ are not in the codepage, then the national default ones will be used: Arabic in the case of English and Hindi in the case of Arabic. I know codepages and unicode are very different things, but I cannot think of any other explanation.
The application is Unicode, and digit substitution works everywhere else (buttons, window titles, etc) but in this custom control, which is used to display the static text in all the application. The TextFormat the DrawString method uses has DigitSubstitution set to StringDigitSubstituteUser (although I’ve tried all the other values with no good results). LOCALE_IDIGITSUBSTITUTION == 2 and LOCALE_SDIGITSUBSTITUTION returns the “Use native digits” digits when I check their values just before DrawString.
Any ideas?
Please, tell me if you need any additional info or clarification about the problem.
Thank you very much!!!
Angel.-
It isn't exactly about code pages, since the Arabic code page, Windows code page 1256, only contains the ASCII digits.
In fact, most of the locales that explicitly support digit substitution don't even have code pages!
Thus the code page itself can't do much here.
The real problem (to the extent it is a problem, a point I'll discuss in a moment), is that GDI+ doesn't really play well when the digits are overridden to not be the kind that typically gets used for the locale.
By letter of the law, well of the user preference, whatever "batshit crazy nonsense" you set in the dialog is what should be used.
Though in practice, the indirect way the digits are used in the original GDI and Uniscribe cases (ref: Digits -- there is no substitute and Windows doesn't let you choose the pinch hitter in digit substitution cases) leads to lots of confusion anyway.
I feel like the GDI+ insistence on having some consistency here is probably a forcing function to take some really potentially whacked out functionality and be a little more sensible.
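As a rough picture of what "national" substitution amounts to at display time, here is a small Python sketch. It says nothing about how GDI+, GDI or Uniscribe decide when to substitute; `substitute_digits` is a hypothetical helper, and it assumes the "Hindi" digits in the table above are the Arabic-Indic digits starting at U+0660.

```python
ARABIC_INDIC_ZERO = 0x0660  # U+0660 ARABIC-INDIC DIGIT ZERO (assumed "Hindi")
DEVANAGARI_ZERO = 0x0966    # U+0966 DEVANAGARI DIGIT ZERO


def substitute_digits(text, zero_code_point):
    """Replace ASCII digits 0-9 with the ten digits starting at zero_code_point."""
    table = {ord("0") + i: chr(zero_code_point + i) for i in range(10)}
    # The stored text keeps its ASCII digits; only the displayed copy changes.
    return text.translate(table)


print(substitute_digits("Room 2024", ARABIC_INDIC_ZERO))
```

The backing text is never altered, which is why the choice of digits can differ between two renderers looking at the same string.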
In the end, you have to decide how much sense you think this UI really makes:
Bonus points if you know what the last row is without setting it and programmatically checking the results, and even more bonus points if you know why they don't show up!
Today is my birthday.
In fact, it is the 16th anniversary of my 25th birthday today! :)
I have decided to do something special with my birthday this year.
I am donating it!
Anytime unemployment is high, a favorite story is about the unemployment rate among college graduates, and whether these stories are based on some study or poll or whatever, they tend to be awash in the perceived irony the situation provides.
Fair enough.
But what most of those stories don't bother to point out is that on the whole college graduates do better for their lifetime income than those without degrees.
Because once you get past all the irony, it is impossible to ignore that it is hard to get a degree without learning. Not just about what the classes and tests and homework are all about, but about learning how to learn more.
Education is about bettering oneself, which increases opportunity.
Anyway, so I am donating my birthday to Vittana, a company whose purpose is summed up in its Wikipedia article:
Vittana is a non-governmental organization that allows people to lend money via the Internet to students in the developing world. It is a 501(c)(3) non-profit organization headquartered in Seattle. Vittana focuses on student loans because student loans are nearly unavailable in developing countries.
Lenders browse through a list of students and select who they wish to fund. The loans issued by Vittana range from $200 to $1,500 and are funded by individual lenders through its website. The loans are disbursed when Vittana has aggregated sufficient money from donors to cover the education expenses. One hundred percent of the loans are given to the student. A mother or a close relative acts as a co-signer. The recipient of the loan can repay the loan after landing a job. Vittana students have had a 97% repayment rate.
Vittana has collaborated with several organizations such as the Clinton Global Initiative, Brigham Young University, frog design, Grameen Foundation, HOPE International, Lex Mundi, Orrick, Perkins Coie, Pop!Tech, Unitus and University of Washington, Seattle. In collaboration with Clinton Global Initiative as a key education partner, Vittana partnered with Africa’s microfinance institutions to launch lending programs which fund post-secondary education for young people. The loans are able to help 10,000 students complete higher education by 2015. Amazon.com has backed Vittana's educational loans.
Or you can see how it works from the Vittana site directly...
Now there are several different ways to help here:
If you work for a company like Microsoft or Boeing or now Apple, then the second and third bullets are eligible for corporate matching (disclaimer: Vittana isn't set up to let you choose the exact students who get loans with the matching funds, in part due to the complicated nature of how soon and how often money is disbursed -- but money matching loans is used for other loans).
There is no shame in the first option, either -- you are still making lives better even if you would rather be repaid later.
You can even mention my name if you want (I make no cut of the money obviously, but I'll get a nice email about having inspired you, which is a cut enough in my book!). And a more awesome birthday present in honor of this, the 16th anniversary of my 25th birthday, than pretty much anything else you can come up with.
Less than 8 hours ago, I gifted $2,427.00 split between 11 different students. In most cases I "finished off" the amount needed for each loan, so in the majority of those 11 cases I [selfishly] get to be the person who got the loan disbursed to the students. I love that feeling, I'll admit it.
Today I'll put in the matching so that Vittana will get $4,854.00 for my birthday, and the lives of a bunch of people will be made better. And after those loans are repaid Vittana will use the money for more loans. And even more people are helped....
It feels good to give, it truly does. Thank you Vittana for helping me make lives better!
My interest in understanding how to get into appropriate formal garb can be traced back to a Heinlein story by the name of Double Star, a story whose principal flaw was that it is so good that it really should have been much longer:
I had no time to explore the apartments; they dressed me for the audience. Bonforte had no audience even dirtside, but Rog insisted on "helping" me (he was a hindrance) while going over the last-minute details.
The dress was ancient formal court dress, shapeless tubular trousers, a silly jacket with a claw-hammer tail, both in black, and a chemise consisting of a stiff white breastplate, a "winged" collar, and a white bow tie. Bonforte's chemise was in one piece, because (I suppose) he did not use a dresser; correctly it should be assembled piece by piece and the bow tie should be tied poorly enough to show that it been tied by hand -- but it is much to expect a man to understand both politics and period costuming.
Even at that young age and prior to any "Bacon number" scandals that would have disallowed an entry into politics, I knew that I would not be holding some elected office in my future. So I decided understanding about dress clothes would work in my favor.
I bought my first tux just after my 21st birthday, and my most recent tux just shy of my 41st. And I found myself needing a tux often enough that both times they paid themselves off within months.
Of course things have changed over the last couple of decades.
These days whatever limited means I have in charm, I lack in basic finger dexterity. So in the new tux at the Rene Ropas prom recently, it took me a good 30 minutes to button the shirt, and I was unable to handle the shirt studs or cuff links (I had to fall back to the built-in plastic buttons). I couldn't even button the top button at all, and what you see peeking out of the jacket is the clip-on tie, because having learned to tie a bow tie more than three half-decades ago, now I can't even fasten the clip-on.
And I seriously needed a shave, obviously.
People thought I looked classy, but the basis of comparison was somewhat suspect -- this is a fairly under-dressed town, if you know what I mean.
I really don't like this incomplete job. I want shirt studs.
And cuff links.
And bow ties tied by hand, just poorly enough so that you can tell they're tied by hand.
For all the events coming up....
I am going to a Prom 2.0 next weekend (her dress will match my cummerbund which will match the corsage I get for her which will match the boutonnière she will get for me).
And I have symphony tickets this season.
Not to mention a charity do or three.
And some weddings.
I want to be dressed right for all of these things.
So I've decided I need a dresser -- a female valet to help me with all of these things that come up from time to time.
Part time because it would be an occasional thing.
Female because I have always felt a little uncomfortable getting ready in situations where I couldn't do things -- but in the past a girlfriend or friend just felt more comfortable.
Most of my friends and especially my close friends (and all of my girlfriends) have always been female....
But I want a "staff" person because I'm not trying to indenture a girlfriend or friend into this job.
So a valet of a sort.
But this is not some clever euphemism to work around sex for hire jobs on Craigslist, which is probably what it would be taken for on the site. Which is not the goal, in this case....
And I would just as soon like to avoid that kind of misunderstanding.
Man, is this a "First World Problem" or what? :-)
It was not quite a year ago, in Myth busting in the console, where Myth #12 was:
Myth #12: You can't change the setting of whether a console window is using a TrueType font.
This myth too is quite untrue.
I have used a few console API functions and that IsConsoleFontTrueType function from this blog to change the font within a console window to a TrueType font, from code running in the console window.
This is something I would never recommend in production code, mind you; I only did it because someone told me it wasn't possible and I was sure she was mistaken.
The impact of the accomplishment was interesting, mind you; she and I dated for about a month after that. ;-)
And the story there may be worthy of its own dedicated myth-busting blog, along with the code itself. If people are interested, I mean. Let me know....
I've had some people ask for the code, so I thought I would oblige...
Others wanted to hear the story about the relationship. Well let's just say it only lasted for a month. Like this code, it really was not something that worked in production!
Anyway, here is the code I wrote at the time:
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

namespace Test {
    public static class ConsoleStuff {
        public static void Main() {
            SwitchToAnyTrueTypeFont();
        }

        internal static bool SwitchToAnyTrueTypeFont() {
            IntPtr stdout = GetStdHandle(STD_OUTPUT_HANDLE);
            uint nFont = 0;

            // First see if we are truetype already
            if (IsConsoleFontTrueType(stdout)) {
                return true;
            }

            // Get the current index to set it back if no font is found.
            CONSOLE_FONT_INFO cfi = new CONSOLE_FONT_INFO();
            if (GetCurrentConsoleFont(stdout, false, ref cfi)) {
                nFont = cfi.Index;
            }
            CONSOLE_FONT_INFO[] fonts = new CONSOLE_FONT_INFO[GetNumberOfConsoleFonts()];
            if (fonts.Length > 0) {
                GetConsoleFontInfo(stdout, false, (uint)fonts.Length, fonts);
            }

            //
            // Try to set each font until a TrueType one is found. This
            // part is hacky, but AFAIK there is no way to get TrueType
            // status on any fonts other than the current one.
            //
            for (int iFont = 0; iFont < fonts.Length; iFont++) {
                if (SetConsoleFont(stdout, fonts[iFont].Index)) {
                    if (IsConsoleFontTrueType(stdout)) {
                        return true;
                    }
                }
            }

            // Could not find a font, so give up
            SetConsoleFont(stdout, nFont);
            return false;
        }

        internal static bool IsConsoleFontTrueType(IntPtr std) {
            CONSOLE_FONT_INFO_EX cfie = new CONSOLE_FONT_INFO_EX();
            cfie.cbSize = (uint)Marshal.SizeOf(cfie);
            if (GetCurrentConsoleFontEx(std, false, ref cfie)) {
                return ((cfie.FontFamily & TMPF_TRUETYPE) == TMPF_TRUETYPE);
            }
            return false;
        }

        internal const int TMPF_TRUETYPE = 0x4;
        internal const int LF_FACESIZE = 32;
        internal const int STD_OUTPUT_HANDLE = -11; // Handle to the standard output device.

        [StructLayout(LayoutKind.Sequential, Pack = 1)]
        internal struct CONSOLE_FONT_INFO {
            internal uint Index;
            internal ushort dwFontSizeX;
            internal ushort dwFontSizeY;
        }

        [StructLayout(LayoutKind.Sequential, CharSet=CharSet.Unicode)]
        internal struct CONSOLE_FONT_INFO_EX {
            internal uint cbSize;
            internal uint nFont;
            internal ushort dwFontSizeX;
            internal ushort dwFontSizeY;
            internal int FontFamily;
            internal int FontWeight;
            [MarshalAs(UnmanagedType.ByValTStr, SizeConst = LF_FACESIZE)]
            internal string FaceName;
        }

        [DllImport("kernel32.dll", ExactSpelling = true)]
        internal static extern bool GetCurrentConsoleFontEx(
            IntPtr hConsoleOutput,
            bool bMaximumWindow,
            ref CONSOLE_FONT_INFO_EX lpConsoleCurrentFontEx);

        [DllImport("kernel32.dll", ExactSpelling = true)]
        internal static extern bool GetCurrentConsoleFont(
            IntPtr hConsoleOutput,
            bool bMaximumWindow,
            ref CONSOLE_FONT_INFO lpConsoleCurrentFont);

        [DllImport("kernel32")]
        private extern static bool SetConsoleFont(IntPtr hOutput, uint index);

        [DllImport("Kernel32.DLL", ExactSpelling = true)]
        internal static extern IntPtr GetStdHandle(int nStdHandle);

        [DllImport("kernel32", ExactSpelling = true)]
        private static extern bool GetConsoleFontInfo(
            IntPtr hOutput,
            bool bMaximize,
            uint count,
            [MarshalAs(UnmanagedType.LPArray), Out] CONSOLE_FONT_INFO[] fonts);

        [DllImport("kernel32", ExactSpelling = true)]
        internal static extern uint GetNumberOfConsoleFonts();
    }
}
You get the point. Just don't get crazy on it....
Previous parts in this series:
Let me see, was I in the last part talking about:
ISSUE 3: Can machine names support the full range of Unicode too?
This issue is the reason I put IDN in quotes -- because the only real connection it has to IDN is how many different services/applications on each machine -- like IIS or Terminal Services or Remote Desktop or SQL Server or whoever -- use the machine name by default.
In some cases, those services and applications have a way to create other names, and in some cases they don't.
But the fact that the default is so often the machine name makes it compelling in the former cases and required in the latter.
For this issue, the support is incomplete.
This incompleteness is something I'll cover in the next part of the series.....
This ISSUE #3 is a really good way to see there are four kinds of actual definitions of TRUTH for Windows/Microsoft technology:
Now in the ideal world, all four amount to the same thing; in practice this is seldom the case.
Now the general problem is that there are two computer names to deal with: the NetBIOS name, and the DNS name.
NetBIOS, as is described in Wikipedia:
In 1986, Novell released Advanced Novell NetWare 2.0 featuring the company's own NetBIOS emulator. Its services were encapsulated within NetWare's IPX/SPX protocol using the NetBIOS over IPX/SPX (NBX) protocol.
In 1987, a method of encapsulating NetBIOS in TCP and UDP packets, NetBIOS over TCP/IP (NBT), was published. It was described in RFC 1001 ("Protocol Standard for a NetBIOS Service on a TCP/UDP Transport: Concepts and Methods") and RFC 1002 ("Protocol Standard for a NetBIOS Service on a TCP/UDP Transport: Detailed Specifications"). The NBT protocol was developed in order to "allow an implementation [of NetBIOS applications] to be built on virtually any type of system where the TCP/IP protocol suite is available," and to "allow NetBIOS interoperation in the Internet."
Now those two RFCs have some contradictions in the NetBIOS naming rules, as described here and in the [MS-NBTE] protocol doc:
[RFC1001] and [RFC1002] are confusing with respect to the definition of the name syntax. [RFC1001] section 5.2 states: "The name space is flat and uses sixteen alphanumeric characters. Names may not start with an asterisk (*)."
[RFC1002] section 4.1 states: "The following is the uncompressed representation of the NetBIOS name "FRED", which is the 4 ASCII characters, F, R, E, D, followed by 12 space characters (0x20)."
It should be clear from the previous statement, because an asterisk and space characters are not letters or numbers, that the term "alphanumeric characters" is confusing at best.
This document clarifies the ambiguity by specifying that the name space is defined as sixteen 8-bit binary bytes, with no restrictions, except that the name SHOULD NOT<2><3> start with an asterisk (*).
Neither [RFC1001] nor [RFC1002] discusses whether names are case-sensitive. This document clarifies this ambiguity by specifying that because the name space is defined as sixteen 8-bit binary bytes, a comparison MUST be done for equality against the entire 16 bytes. As a result, NetBIOS names are inherently case-sensitive.
It is important to understand that the choice of name used by a higher-layer protocol or application is up to that protocol or application and not NetBIOS. A NetBIOS over TCP implementation MUST NOT enforce the use of the convention discussed in section 1.8.
Now that I think of it, Larry Osterman's How do I compare two different NetBIOS names? blog is helpful here for developer truths.
The SetComputerNameEx docs point out how by convention the NetBIOS and DNS names are linked, and the problems with breaking the link:
ComputerNamePhysicalDnsDomain: Sets the primary DNS suffix of the computer.
ComputerNamePhysicalDnsHostname: Sets the NetBIOS and the Computer Name (the first label of the full DNS name) to the name specified in lpBuffer. If the name exceeds MAX_COMPUTERNAME_LENGTH characters, the NetBIOS name is truncated to MAX_COMPUTERNAME_LENGTH characters, not including the terminating null character.
ComputerNamePhysicalNetBIOS: Sets the NetBIOS name to the name specified in lpBuffer. The name cannot exceed MAX_COMPUTERNAME_LENGTH characters, not including the terminating null character.
Warning: Using this option to set the NetBIOS name breaks the convention of interdependent NetBIOS and DNS names. Applications that use the DnsHostnameToComputerName function to derive the NetBIOS name from the first label of the DNS name will fail if this convention is broken.
And then there is the crucial bit, from the [MS-NBTE] protocol doc:
5. If the NetBIOS name from the LMHOSTS file is less than 16 bytes in length, pad the name with spaces, and uppercasing all characters within the ASCII range which results in ComputerName.
Note that only the letters A-Z are changed by an UPPERCASING operation, which means that if the casing rules specified here are used in any population of the naming table, then the biggest concerns developers have told me about the OEMCP dependencies that affect casing are not true (since A-Z are the same on every OEMCP). Though this point is not entirely clear, and unless someone reviewed actual code I'd frankly be suspicious anyway.
Especially given that the original NetBIOS spec required case sensitivity and even old NetBEUI specs have the same rules (after uppercasing).
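The padding and casing step quoted from [MS-NBTE] is easy to sketch in Python. This is only my reading of that one sentence, not the protocol implementation: the function names are made up, and "uppercasing all characters within the ASCII range" is assumed to mean that only the bytes for a-z change while everything from 0x80 up is left alone.

```python
NETBIOS_NAME_LENGTH = 16


def netbios_name_from_lmhosts(name):
    """Pad a name (bytes) to 16 bytes with spaces and uppercase ASCII a-z only."""
    if len(name) > NETBIOS_NAME_LENGTH:
        raise ValueError("NetBIOS names are at most 16 bytes")
    padded = name.ljust(NETBIOS_NAME_LENGTH, b" ")
    # Bytes outside a-z (including everything >= 0x80) pass through untouched,
    # so the result does not depend on any OEM code page casing table.
    return bytes(b - 0x20 if 0x61 <= b <= 0x7A else b for b in padded)


def same_netbios_name(a, b):
    """Names are equal only if all 16 bytes match exactly."""
    return a == b


print(netbios_name_from_lmhosts(b"fred"))
```

Under that assumption a name such as `b"caf\xe9"` keeps its 0xE9 byte as is, which is exactly why the OEMCP would not matter if these rules were the only ones in play.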
Also note that the UI for changing the computer name includes the ES_OEMCONVERT flag (see Raymond Chen's What is the deal with the ES_OEMCONVERT flag? blog on how crazy *that* flag is), because even in current UI the owners don't want to remove ES_OEMCONVERT: they are worried about the unclear impact of the OEMCP on NetBIOS, which no one feels comfortable changing.
So in the end, even though DNS names are tied to the NetBIOS names, they may not be impacted as much as anyone thinks when one sticks with the Unicode version of the DnsHostnameToComputerName function (DnsHostnameToComputerNameW) and the ES_OEMCONVERT usage.
And thus the machine name namespace is sharply limited until:
The only happy ending to all of this is that there are ways to limit the impact of the computer name issues hemorrhaging into other areas....
Oh, and the fact that functions like the DnsValidateName function have text like this:
Note If DnsValidateName returns DNS_ERROR_NON_RFC_NAME, the error should be handled as a warning that not all DNS servers will accept the name. When this error is received, note that the DNS Server does accept the submitted name, if appropriately configured (default configuration accepts the name as submitted when DNS_ERROR_NON_RFC_NAME is returned), but other DNS server software may not. Windows DNS servers do handle NON_RFC_NAMES.
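The "treat it as a warning" advice can be sketched as a three-way check in Python. To be clear, this does not reproduce the rules DnsValidateName applies: `check_hostname` and its return values are hypothetical, and the only assumptions are the classic letter-digit-hyphen label syntax and the 63-byte label limit.

```python
import re

# Classic "LDH" label: letters, digits and hyphens, no hyphen at either end.
_LDH_LABEL = re.compile(r"^[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")


def check_hostname(name):
    """Return 'ok', 'non-rfc' (warn, but accept) or 'invalid' (reject)."""
    labels = name.rstrip(".").split(".")
    if not name or any(
        label == "" or len(label.encode("utf-8")) > 63 for label in labels
    ):
        return "invalid"
    if all(_LDH_LABEL.match(label) for label in labels):
        return "ok"
    # Well-formed but outside the strict syntax: some servers may refuse it.
    return "non-rfc"


for host in ("server1.example.com", "m\u00fcnchen.example", "a..b"):
    print(host, "->", check_hostname(host))
```

A caller would log the "non-rfc" case and carry on, and only refuse names that come back "invalid".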
So maybe one day we'll be better here...
Perhaps you read somewhere that I have Multiple Sclerosis.
It's true.
I have had people ask me about my prognosis, with my MS.
And for some people, my prognosis can even be a soup question.
But the one thing I had was some perspective.
The kind of perspective that comes from being in the "last phase" of Multiple Sclerosis, progressive MS, the one that even the pharmaceutical companies have largely given up on, since the market for lucrative ways to sell long-term hypothetical hope for $1000 a month is covered by insurance....
But the one thing that MS has taught me is that even when the fat lady is singing, this show can find a whole new way to start up.
In the beginning of last week I went to see my neurologist so she could renew my prescriptions (they have to see you every once in a while!), but this time I had news: I was showing some new symptoms! Like something acute, you know!
Something that hasn't happened for years.
She recommended a 3-day course of methylprednisolone (a steroid), kind of a mainstay for assisting in the relapsing piece of relapsing/remitting multiple sclerosis....
But I was less convinced, since in the last 20 years, methylpred seemed to be less useful over time....
And then on Friday, I was at a party. A wine tasting party.
The name of the party was La Figa.
I swear I didn't know what that meant in Italian slang until later....
Somewhere between half of the first glass (a nice chardonnay) and half of the second (a Syrah I had been admiring), I was stuttering.
Well, not really stuttering.
I mean, you know when Porky Pig tries to say "gug-guh-guh- that's all folks", where he works around his inability to say "Goodnight folks" by saying "That's all folks!", right?
Yes a stutter. But in my case it was some type of aphasia, much more of an expressive aphasia than Porky's presumed dysarthria.
I knew what I wanted to say, but before I actually said it I could feel the wrong word coming out, and rather than saying the wrong thing I found some other thing to say. A thing that sometimes made less sense in the context.
A slightly more cerebral version of what Porky was doing, which is ironic since his solution made the same intended point while I would often fail to make the original point entirely.
As the evening progressed I mentally worked out from my Manter & Gatz what was happening -- not dysarthria, and not a true Wernicke's (receptive) aphasia, though with similar impact, but not a Broca's (expressive) aphasia, either.
Eventually I ended up in the emergency room, since although a stroke or TIAs were unlikely, I would be foolish not to make sure.
And in the end, no bleed or TIA, but the most active MRI in terms of enhancing MS lesions in almost a decade!
And now suddenly all the drugs denied to progressive MS are back on the table -- Copaxone and Tysabri and the rest.
It's a whole new ballgame, people!
In an ideal world, IDN support will in future be utterly complete.
And I don't just say that as an owner responsible to get people in the company to be supporting IDN. :-)
Luckily, failing to live up to that ideal is not a requirement!
Now conversion to Punycode is obviously a support requirement, but by itself that doesn't help so much.
At this point it would be very helpful to lay out three basic categories of "IDN" requirements and talk about the relative importance of each....
ISSUE 1: Can use services like web sites that people publish which use Internationalized Domain Names:
This is the first and most basic requirement, being able to get to sites that are hosted on domains that use characters beyond the subset of ASCII that used to be required for domains.
Eventually everyone will need to be able to do this if we don't want larger and larger pieces of the Internet to evolve and leave the other parts behind unable to reach anyone.
Most commonly required is web sites, but over time many others can be required....
This is something Internet Explorer (IE) started doing in version 7.0 -- clumsily at first but steadily getting better and better (though without many web sites using IDN it was not as crucial for them or anyone to get better).
In theory others could do it before browsers did the work, but they had to do it all on their own.
The most impressive version of #1 is any application or protocol or platform that can do the steps laid out in part 3: There's no "I" in DIY, either!.
ISSUE 2: Can host services like web sites that can be published to others that use Internationalized Domain Names.
Obviously not everyone is an ISP or service provider or product that can manage web sites and services that need this, but for those who do, this is important.
And this is where a product like Internet Information Services (IIS) comes in, with later versions able to host themselves as whatever site you want them to.
Someone has registered an Internationalized Domain Name with an ISP and then they can host it in a product like IIS.
They may also need the ability to get referrer names from addresses, and those names may themselves be Internationalized Domain Names. But that part is pretty easy on a machine that knows how/when to convert into and out of Punycode....
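To make "convert into and out of Punycode" a little more concrete, here is a minimal sketch in Python. It is not what IE or IIS do internally; it just uses Python's built-in "idna" codec, which implements the older IDNA 2003 rules, so newer IDNA 2008 behavior may differ. The helper names to_ascii and to_unicode and the host name are made up for illustration.

```python
def to_ascii(host: str) -> str:
    """Convert a Unicode host name to its ASCII (Punycode "xn--") form."""
    # The "idna" codec handles each dot-separated label separately.
    return host.encode("idna").decode("ascii")


def to_unicode(host: str) -> str:
    """Convert an ASCII ("xn--") host name back to its Unicode form."""
    return host.encode("ascii").decode("idna")


if __name__ == "__main__":
    ascii_form = to_ascii("bücher.example")
    print(ascii_form)              # xn--bcher-kva.example
    print(to_unicode(ascii_form))  # bücher.example
```

A host or a referrer parser would typically keep the ASCII form on the wire and in logs, and convert to the Unicode form only for display.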
Stay tuned for the next exciting episode!
A segue -- a wonderful performance of Andrew Lloyd Webber's Don't Cry For Me Argentina (from Evita, sung by Elaine Paige):
Okay, this should get you in the mood for today's blog.... :-)
The question was straightforward enough:
Hope you are doing well , I m needing a help with a customer in Argentina, they are using a unantend.xml to install Windows 7, the part of the XML that is showing the issue is pasted below, the customer wants to use the Spanish Argentina Layout on the XML file, so they are putting “es-ar” as User Locale , and UIlanguage , as per my research those are non documented values (ref: Available Language Packs) , so I would like to confirm If I m right ? and if I m right , is the only solution to use “es-ES”. Do we have any way to configure the Layout as Spanish Argentina ?
<InputLocale>0000080a</InputLocale>
<SystemLocale>es-es</SystemLocale>
<UILanguage>es-es</UILanguage>
<UILanguageFallback>es-es</UILanguageFallback>
<UserLocale>es-AR</UserLocale>
When we are using the code above, this window is displayed and the customer want to answer that from the XML file, if its possible to select Spanish Argentina from the XML.
Well, as the link to the Available Language Packs indicates, the only Spanish Language Pack we have is es-ES -- there is no es-AR.
So if you put an unknown value in the answer file, you'll be prompted by setup.
Though I'll admit there is some doc confusion -- like the fact that the <UserLocale> info points to the UI Language list. The whole area can be confusing enough that fixing the weird links might make sense....
Now as this link indicates, UILanguageFallback only makes sense when two conditions are met:
But of the other values, the InputLocale looks wrong. It is documented as follows:
Specifies the input language and keyboard layout for a Windows installation.
Input_locale can be one of two values:
To use the default input locale for a language, you can specify the language identifier. For example, to use the default keyboard for English (United States) that corresponds with the QWERTY keyboard, you can specify the value en-US.
Specify the locale ID and keyboard layout hexadecimal values. For example, for en-US, use 0409:00000409. The first value (0409) is the locale ID that represents the input language and the second value (00000409) is the keyboard layout value.
If you want to specify more than one input locale to add support for more than one keyboard type, you can specify multiple values separated by semicolons. For example, you can specify <InputLocale>en-US; fr-FR; es-ES</InputLocale> to add support for English (US), French (France) and Spanish (Spain) keyboards. The first value listed is used as the default keyboard.
The valid keyboard layouts that can be configured on your system are listed in the following registry key. HKEY_LOCAL_MACHINE\SYSTEM\ControlSet001\Control\Keyboard Layouts
For a list of the default input locale values, see Supported Language Packs and Default Settings.
So while putting in either
<InputLocale>2c0a:0000080a</InputLocale>
or
<InputLocale>es-AR</InputLocale>
is legal, clearly
<InputLocale>0000080a</InputLocale>
is not....
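If you want to catch this kind of mistake before setup prompts you, a quick shape check is easy to write. The sketch below is my own illustration in Python, not anything setup itself runs: the function name is_valid_input_locale is hypothetical, and the locale-name pattern is deliberately simplified to language-REGION (real locale names can be longer, for example with a script tag). It only checks the two documented shapes, a locale name or a locale ID plus keyboard layout pair, separated by semicolons.

```python
import re

# Simplified locale name such as "es-AR" (assumption: language-REGION only).
_NAME = re.compile(r"^[A-Za-z]{2,3}-[A-Za-z]{2}$")
# Locale ID and keyboard layout pair such as "2c0a:0000080a":
# four hex digits, a colon, then eight hex digits.
_PAIR = re.compile(r"^[0-9A-Fa-f]{4}:[0-9A-Fa-f]{8}$")


def is_valid_input_locale(value: str) -> bool:
    """Return True if every semicolon-separated entry has a documented shape."""
    parts = [p.strip() for p in value.split(";")]
    return all(bool(p) and bool(_NAME.match(p) or _PAIR.match(p)) for p in parts)


if __name__ == "__main__":
    for candidate in ("2c0a:0000080a", "es-AR", "en-US; fr-FR; es-ES", "0000080a"):
        print(candidate, "->", is_valid_input_locale(candidate))
```

This only validates the syntax; whether a given keyboard layout actually exists on the machine is a separate question that the registry key mentioned above answers.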
The final text in the answer file:
<InputLocale>2c0a:0000080a</InputLocale>
<SystemLocale>es-es</SystemLocale>
<UILanguage>es-es</UILanguage>
<UserLocale>es-AR</UserLocale>
should do much better, and get you this keyboard:
If you use the other syntax (the locale name, es-AR), all of the keyboards in that locale's LOCALE_SKEYBOARDSTOINSTALL value will be added.
Beyond that, Martin had some great advice to help out in these kinds of situations:
An easy way to find out such a (sometimes complicated!) syntax is to use the Microsoft Deployment Toolkit (MDT 2010). You can use the GUI to select your values and MDT creates a summary page showing the details to enter into the unattend.xml file.
Quite helpful and saves a lot of troubleshooting time.
MDT? It's right here. Good advice!
So I Won't Cry For Thee Argentina, because you can still use es-ES for the Spanish Language Pack, and if you put in the InputLocale properly, you should get what you're expecting. And for all such situations there is a way to find out exactly what you need....
William asked (via the Contact link):
Hi Michael,
I just read your post about font substitution, which is why I know you're likely to have the answer to my question!
I am running Windows XP (English), but I work for a Japanese company so I get a lot of emails in Japanese.
The problem is that "MS Pゴシック" (MS P Gothic) is set as the default font for Outlook on everyone's PC, even if they are non-Japanese and even if they are running the English version of the OS. This appears to be the policy of my company's Help Desk.
Since I can't change their policy, and I certainly can't go around secretly to everyone's machine and change their default font, I started looking at Font Substitution as a possible way to silently replace their horrible MS P Gothic font with a nicer one. (Note that I'm only trying to replace the Latin character set, which looks horrible. The Japanese character set in MS P Gothic looks just fine.)
Is there any way to do this? I tried adding a font substitution entry in the registry as follows, but it didn't work.
[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\FontSubstitutes]
"MS Pゴシック,0"="Tahoma"
What would you suggest?
Many thanks in advance for your help!
Cheers,
Will.
Unfortunately, there is no good answer here for such a global change, beyond really encouraging people to move to Meiryo or Meiryo UI (which is difficult if they're using XP).
There is a real lack of a global system to subvert the meaning of the fonts used when you view email, other than perhaps the settings in Outlook.
Note that these settings can be in different parts of the Outlook user interface in different versions, so you may need to search a bit to figure out where the settings are in your version....
Basically, there are two big changes:
First, make sure you view all email as plain text:
This will largely take the particular font out of the equation entirely.
Then, to be sure you get the font you want, make sure to explicitly set the font for plain text mails:
This should spare you a lot of the Latin text that uses a font you don't want.
Now of course when you reply to email the cat may be out of the bag, and people will see you've dumped their formatting. I only mention this so you won't be blindsided by it later.... :-)
This blog you are reading right now is not about Windows 8.
It is also not about //build/.
Though it is about a bug that could repro there too, under some circumstances....
Some of you long-time readers may recall back in early 2006 when I wrote about Getting all of the localized names of a font, or almost a year later, when I wrote about Getting all of the localized names of a font[.NET].
Basically these blogs cover the support that Windows and .NET have for the various language-specific entries in East Asian TrueType and OpenType fonts, contained in name (the Naming Table).
You know, the "feature" I mentioned in East Asian Font Names.... so long ago.
Now in theory none of this should matter.
Since (as I mentioned in East Asian Font Names....) both the English and localized names are registered in Windows, in Win2000 and later.
In fact, any time I am running on English, the Chinese - Taiwan OS, or the Chinese - Taiwan MUI Language Pack, either PMingLiU or 新細明體 can be used, and either will work.
However, do you remember in East Asian Font Names.... when I said
For prior versions of Windows, the only practical solution is to try one font, then if that attempt fails to try the other. Thus, the list. :-)
?
Well, it turns out that advice is still needed, sometimes.
It is still needed when you are using the Chinese - Hong Kong OS.
Or the MUI Language Pack.
In both of those cases, only one of the two font names will work.
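The "try one name, then try the other" advice can be sketched in a few lines. This is a Python illustration only: the function first_usable_font and the can_create callback are hypothetical stand-ins for whatever your platform uses to test whether a font name resolves, and the simulated set of names at the bottom is an assumption made up for the demo, not real system data.

```python
from typing import Callable, Iterable, Optional


def first_usable_font(names: Iterable[str],
                      can_create: Callable[[str], bool]) -> Optional[str]:
    """Return the first name in the list that the system accepts, else None."""
    for name in names:
        if can_create(name):
            return name
    return None


if __name__ == "__main__":
    # Simulated machine where only one of the two names is recognized.
    recognized = {"新細明體"}
    candidates = ["PMingLiU", "新細明體"]
    print(first_usable_font(candidates, lambda n: n in recognized))
```

The point of keeping both names in the list is that the caller never has to know which of the two the current system happens to recognize.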
At this point, you might be thinking "big deal, Michael!" since it would kind of make a weird kind of sense if (for example) when you ran on Turkish or Tajik or Tamil that for some reason only the English name worked.
I mean, it would hardly be great, taking us back to NT4.
But we lived through it once; we could live through it again.
Unfortunately, that isn't what's going on.
As it turns out, when running on Chinese - Hong Kong (OS or MUI Language Pack), the English name doesn't work -- only the localized name does!
Clearly, this is a bug.
Maybe it's in Fontsetup.inf, whose tortuous logic only gets more complicated for Hong Kong.
Maybe it's in the code behind AddFontResourceEx, that registers the names when the WindowStation starts.
Or maybe it's somewhere else entirely.
Now this whole area of code has one special feature.
It's fragile and no one wants to touch it because they don't want to break anything.
Though I hope someone takes a look.
I don't want to run NT4!
My MS has been getting worse recently.
Just in the last few weeks, I mean.
I think it may be related to the heat, though this is the mildest Seattle summer in years, so it can't just be that.
My finger dexterity is way off, and my balance is even worse than usual. Even around the house.
And I'm having more handrails added in the bathroom (the landlord is pretty flexible about this, it keeps them from needing to move me to an ADA unit or retrofit the one I'm in).
Also, I know that when typing I officially have more typos than correctly typed letters, which has two direct impacts:
Hopefully it is temporary, but I wouldn't ever make book on that, since it doesn't always work out that way.
If you know what I mean.
Anyway, I'm actually working hard on blogs for tomorrow and the day after, covering some holes in our Hong Kong support.
So you can think of today as some filler, and a health update....
So I was asked the other day in response to my blog The example was wrong, but the point of the example was spot on!, and the part of the blog that said:
And note I am assuming that all sixteen of the Arabic locales on Windows support the exact same spelling of the Hijri month names, even though some have slightly different spelling of the Gregorian month names (and other dialectical differences as well). Maybe this is more reasonable to let slide, even if there are in-country differences in Iraq, Egypt, Libya, Algeria, Morocco, Tunisia, Oman, Yemen, Syria, Jordan, Lebanon, Kuwait, U.A.E., Bahrain, or Qatar. Certainly more than the ones I mentioned earlier.
asking me for more details.
I'll start by giving the generic values for the three Gregorian calendars related to Arabic:
for the first clue:
Ok, got it? Three different ways of looking at the same month name.
Then, when you look at the LOCALE_SMONTHNAME* values that represent the localized Gregorian month names for each locale, you may be able to see the pattern:
You now see a bunch of things:
The ar-IQ, ar-SY, ar-JO, and ar-LB locales generally use the CAL_GREGORIAN_ARABIC style names.
While ar-DZ and ar-TN are mostly using the CAL_GREGORIAN_XLIT_ENGLISH style Arabic names.
And ar-SA, ar-EG, ar-LY, ar-OM, ar-YE, ar-KW, ar-AE, ar-BH, and ar-QA seem to be using the Gregorian transliterated French style Arabic names.
The odd man out is ar-MA, which has differences from all of them for some months.
And you can likely spot other similarities and differences yourself among various months as well (more if you can read Arabic obviously, but even if not you can probably see differences with the shapes).
The end result is of course that beyond the three big categories that would represent three different ways of spelling the month names, there are other differences as well.
Perhaps these are bugs simply never reported by any customer.
Though more likely they are mostly some random yet specific known differences between countries, for the host of reasons that things can be spelled differently for the same language in different areas.
Thinking back to The example was wrong, but the point of the example was spot on! for a moment, it's hard to imagine that among all 16 countries there are no differences among the Hijri month names too.
Which, if true, would mean that in addition to the many non-Arabic language (and in some cases Arabic script) locales that use the Hijri calendar and are forced to use the Arabic language, there may even be some Arabic language locales that have differences in the Hijri month names too?
It may be one of the reasons there are different Arabic locales. Though in the case of the Hijri issue (if it exists) we're not covering it?
Anyone who knows the Arabic language want to comment on some of the other differences?
Over in the Suggestion Box, Matthew Oakley asked:
Hi. I am using Ms Access as a front end to a mysql database for a web testing system. As far as I can tell, everything is set to use utf-8. However, when I type the korean character which is unicode c548 (using IME), it ends up stored in the database as ec95. When I type the same character in the webpage, it is stored in the same database as bec8. I have no idea what is going on! English is stored perfectly, ime seems to be creating a nightmare. Could you make a topic explaining what IME does in depth?
Interesting.
The character in question, U+c548, is also known as HANGUL SYLLABLE IEUNG A NIEUN.
This one:
안
Now the bytes in UTF-8 for this character would be:
EC 95 88
Notice the first two bytes.
They look a lot like the two bytes Matthew said were ending up in the database.
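You can check the byte math yourself. Here is a small Python sketch (my own illustration, not anything Access or MySQL run) that encodes the character and then chops it to two bytes, which is one plausible way to end up with ec95. Whether truncation is what actually happened in Matthew's setup is an assumption.

```python
ch = "\uC548"  # 안

# Encode to UTF-8 and show the bytes.
utf8 = ch.encode("utf-8")
print(utf8.hex(" ").upper())  # EC 95 88

# Keep only the first two bytes, as a 2-byte-wide field might.
truncated = utf8[:2]
print(truncated.hex())  # ec95

# The truncated sequence is no longer valid UTF-8.
try:
    truncated.decode("utf-8")
except UnicodeDecodeError as err:
    print("not decodable:", err.reason)
```

So a value of ec95 is what you would see if the UTF-8 bytes were being treated as two-byte units somewhere and only the first unit survived.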
I don't know about you, but I'm with Leroy Jethro Gibbs -- I don't believe in coincidences!
Now the IME works just fine (you can get the letter in question by typing D K S or IEUNG A NIEUN that is interpreted as ᄋ ᅡ ᆫ , which gets assembled the right way on any version of Access).
I just tried it here.
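If you don't have a Korean IME handy, the "assembled the right way" part can be reproduced with plain Unicode normalization. This Python sketch is only an illustration of the composition, not of what the IME does internally: it composes the three conjoining jamo into the single syllable, and then repeats the result with the standard Hangul composition arithmetic.

```python
import unicodedata

# Conjoining jamo: IEUNG (initial), A (vowel), NIEUN (final).
jamo = "\u110B\u1161\u11AB"

# NFC normalization composes the three jamo into one precomposed syllable.
syllable = unicodedata.normalize("NFC", jamo)
print(f"U+{ord(syllable):04X}")  # U+C548

# The same result from the Hangul composition arithmetic.
l_index = 0x110B - 0x1100  # initial consonant index
v_index = 0x1161 - 0x1161  # vowel index
t_index = 0x11AB - 0x11A7  # final consonant index
code_point = 0xAC00 + (l_index * 21 + v_index) * 28 + t_index
print(f"U+{code_point:04X}")     # U+C548
```

Either way you land on the same code point, which is why the IME itself is an unlikely suspect.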
But something else is at work here, something that is corrupting the text....
Maybe something along the lines of custom processing code?
Or the data layer? Access is not a paragon of UTF-8 support.
It isn't the IME....