Sorting it all Out
Michael Kaplan's random stuff of dubious value
Be sure to read the disclaimer here first!
  • Sorting it all Out

    Sometimes MMM is MMMM, other times MMM is M! (aka Not all abbreviations are created equal)

    There are several LCTYPE values suitable for use by GetLocaleInfo and GetLocaleInfoEx that act as abbreviations.

    Like the seven LOCALE_SABBREVDAYNAME* values representing the days, which are documented as:

    Native abbreviated name for [Day]. The maximum number of characters allowed for this string is 80, including a terminating null character.

    Or like the twelve LOCALE_SABBREVMONTHNAME* values representing the months, which are documented as:

    Native abbreviated name for [Month]. The maximum number of characters allowed for this string is 80, including a terminating null character.

    And there are helpful doc topics like Day, Month, Year, and Era Format Pictures that say things like:

    d Day of the month as digits without leading zeros for single-digit days.
    dd Day of the month as digits with leading zeros for single-digit days.
    ddd Abbreviated day of the week as specified by a LOCALE_SABBREVDAYNAME* value, for example, "Mon" in English (United States).
    dddd Day of the week as specified by a LOCALE_SDAYNAME* value.
    M Month as digits without leading zeros for single-digit months.
    MM Month as digits with leading zeros for single-digit months.
    MMM Abbreviated month as specified by a LOCALE_SABBREVMONTHNAME* value, for example, "Nov" in English (United States).
    MMMM Month as specified by a LOCALE_SMONTHNAME* value, for example, "November" for English (United States), and "Noviembre" for Spanish (Spain).

    Very helpful, right? :-)

    Well, they might have given a few abbreviated examples in other languages, but I suppose we can guess what they might be. No harm, no foul....

    Any developer can wrap their head around these examples and this information.

    Go for it, developers!

    Oh, wait -- before I forget, there are some problems with going forward with this knowledge here.

    Like the not insignificant number of locales whose abbreviated names are no different from the regular names.

    So (for example) many locales will have a LOCALE_SABBREVDAYNAME5 that is identical to LOCALE_SDAYNAME5, a LOCALE_SABBREVMONTHNAME10 that is identical to LOCALE_SMONTHNAME10. And so on.

    They don't want to mention that? Seems like it ought to be in there somewhere....

    It may certainly put a cat among the pigeons for some unsuspecting developers!

    Oh yeah, and it gets worse.

    Like let's look at the Japanese abbreviated month names:

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12

    Um, wait.

    Isn't that what M (Month as digits without leading zeros for single-digit months) is supposed to be for? That's just unexpected.

    We do not document that either -- even though we've been doing it for years!

    And there are other strange examples, too. Perhaps you have a favorite?

    There is of course what we do for zh-CN abbreviated and full month names:

    1月 一月
    2月 二月
    3月 三月
    4月 四月
    5月 五月
    6月 六月
    7月 七月
    8月 八月
    9月 九月
    10月 十月
    11月 十一月
    12月 十二月

    Just keep in mind that 月 is "month" and you are pretty much reading Chinese now, right? :-)

    Now I'm not saying it's wrong -- it is right, but it is odd if you are not Chinese to think of the difference here as being abbreviated versus full month names.

    Remember what we have for the Japanese abbreviated month names? We have the same thing for Korean -- oh, and also Czech and Slovak, too!

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12

    I am fully prepared to assume that this different approach might be expected in East Asia.

    But that it might also be in Slovakia and the Czech Republic too? That's new information!

    It is hardly my first intuitive guess that those are the expected abbreviations for these Czech and Slovak month names, right?

    leden január
    únor február
    březen marec
    duben apríl
    květen máj
    červen jún
    červenec júl
    srpen august
    září september
    říjen október
    listopad november
    prosinec december

    Now all I know is that if I am a developer I have a somewhat foggy notion of what MMM means.

    Any developer or designer trying to create a calendar might.

    But it clearly ought to be documented in some conceptual topic that, for many locales, I should be ready to assume that MMM might be MMMM. Or even M!
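To make the "MMM might be MMMM, or even M" point concrete, here is a minimal sketch that checks which format picture an abbreviated month name really behaves like. The month data is copied from the tables above; the function names and the three labels are hypothetical, made up for illustration, and nothing here calls GetLocaleInfoEx.

```python
from typing import List, Set

# Full month names copied from the tables in this post.
CZECH_FULL = ["leden", "únor", "březen", "duben", "květen", "červen",
              "červenec", "srpen", "září", "říjen", "listopad", "prosinec"]
ZH_CN_FULL = ["一月", "二月", "三月", "四月", "五月", "六月",
              "七月", "八月", "九月", "十月", "十一月", "十二月"]

# Abbreviated forms as described in the post: bare digits for Czech,
# digits followed by 月 for zh-CN.
CZECH_ABBREV = [str(n) for n in range(1, 13)]
ZH_CN_ABBREV = [f"{n}月" for n in range(1, 13)]


def classify_mmm(month_number: int, abbreviated: str, full: str) -> str:
    """Return the format picture that the MMM string actually behaves like."""
    if abbreviated == str(month_number):
        return "M"      # the "abbreviation" is just the month number
    if abbreviated == full:
        return "MMMM"   # the "abbreviation" is the full name
    return "MMM"        # a genuinely distinct abbreviated form


def behaviours(abbrevs: List[str], fulls: List[str]) -> Set[str]:
    """Collect the behaviours seen across all twelve months of one locale."""
    return {classify_mmm(number, abbrev, full)
            for number, (abbrev, full) in enumerate(zip(abbrevs, fulls), start=1)}


print(behaviours(CZECH_ABBREV, CZECH_FULL))
print(behaviours(ZH_CN_ABBREV, ZH_CN_FULL))
print(classify_mmm(5, "May", "May"))
```

Even English is not immune: the last call shows that "May" is its own abbreviation, so a layout that budgets three characters for MMM only works by luck.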

  • Sorting it all Out

    The evolving Story of Locale Support, part 17: Today I feel like translating you more than before

    Apologies to Grace Slick for the title riff!

    So, back when I was writing about The Locales of Windows 7, all divvied up and The Locales of Windows 7, divvied up further, there was an important fact that is apparently all too easy to forget about.

    "What fact, Michael?" you may ask.

    I mean, not because you think I need the prompting, but because you know when I get on a roll I just assume people will be caught up in it too!

    Anyway, I'll explain.

    You have that very first table of "Language Pack" locales I mentioned in the "divvying" blog -- fewer than forty of those.

    Then that second table of "Language Interface Pack" locales -- add another 60 or so, each of which will localize fewer strings than the first group, but are as good as they can be under the circumstances.

    And then that third table that is just locales -- none of it localized at all, but with all that underlying data we store so that people who know or use or love those languages can benefit from what we do have.

    Now in an ideal world of:

    • infinite time;
    • infinite money;
    • infinite expertise.

    we can imagine every single locale being in the first table, and nothing needing to be divvied up at all.

    However, we do not live in such a world.

    This gives us two choices:

    • We can dump support for everything in Table 3 and tell you "tough luck" if you know or use or love one of them, since we only support languages we translate into, or
    • We have to have two separate concepts: one for the user locale, one for the UI language.

    Obviously we have chosen the second option. :-)

    Of course that doesn't stop people from fighting the idea.

    Sometimes even Microsoft employees who add features that use locales!

    For example, let's take a quick look at the MonthCal -- the MonthCalendar control, the MonthView ActiveX control. They are all based on the same thing.

    Here is an example showing a few of them in design view:

    The control is based on LOCALE_USER_DEFAULT.

    The entire control.

    Well, with one exception.

    The word TODAY, which is localized, and thus based on User Interface Language!!!!

    People will then complain (as owners or users of the feature who try to mix and match the two settings and see the effect on the control) that this is really non-intuitive.

    And they loudly proclaim: "Can't we just have one setting? This is broken!"

    Um, no.

    Sorry.

    Maybe this use of the User Interface Language that has the [possibly localized] Today is broken. But that's their feature.

    And that is hardly the fault of the user locale. Or the UI Language. Or the data supporting either one. This is just a bit of a feature that doesn't scale to the product it is sitting on.

    It is a simple design flaw in this piece of UI, the owners of which have no one to blame but themselves for the bug!

    So they are responsible for doing something here, or not.

    To put it simply (if rudely):

    We can't dumb down our locale support just because they dumbed down their feature!

    Maybe next time, they'll formally request a feature to add such strings to the locale data. Maybe the "Today", maybe something else like it.

    And maybe they'll do it with enough time to actually let us get the data.

    Or, if history of the last ten versions of Windows acts as a guide, they'll drop the issue, and bring it up next version, when it is too late once again. :-/

  • Sorting it all Out

    Changing the world, 0.1 steps at a time!

    Unicode 6.1 has been officially released!

    From the release mail:

    *Mountain View, January 31, 2012*. The Unicode Consortium announces the
    release of Version 6.1 of the Unicode Standard, continuing Unicode's
    long-term commitment to support the full diversity of languages around
    the world. This latest version adds characters to support additional
    languages of China, other Asian countries, and Africa. It also addresses
    educational needs in the Arabic-speaking world. A total of 732 new
    characters have been added. For full details, see
    http://www.unicode.org/versions/Unicode6.1.0/.

    This version of the Standard also brings technical improvements to
    support implementers. Improved changes to property values and their
    aliases mean that properties now have easy-to-specify labels. The new
    labels combined with a new script extensions property means that regular
    expressions can be more straightforward and are easier to validate.

    Over 200 new Standardized Variants have been added for
    /emoji/ characters, allowing implementations to distinguish preferred
    display styles between text and /emoji/ styles. For example:

    26FA FE0E         TENT text style
    26FA FE0F         TENT emoji style
    26FD FE0E         FUEL PUMP text style
    26FD FE0F         FUEL PUMP emoji style

    Among the notable property changes and additions in Unicode 6.1 are two
    new line break property values, which improve the line-breaking behavior
    of Hebrew and Japanese text. Segmentation behavior was also improved for
    Thai, Lao, and similar languages.

    Two other important Unicode specifications are maintained in synchrony
    with the Unicode Standard, and have updates for Version 6.1. These will
    be finalized in February:

        * UTS #10, Unicode Collation Algorithm
        * UTS #46, Unicode IDNA Compatibility Processing

    You can check it out at the Unicode website, here!
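The variation sequences in the release mail are easy to build by hand: a base character followed by U+FE0E asks for text style, and followed by U+FE0F asks for emoji style. A small sketch, with hypothetical helper names, and assuming nothing about whether a given font or renderer actually honors the request:

```python
import unicodedata

TEXT_STYLE = "\uFE0E"   # VARIATION SELECTOR-15
EMOJI_STYLE = "\uFE0F"  # VARIATION SELECTOR-16


def with_style(base: str, selector: str) -> str:
    """Append a variation selector to a single base character."""
    if len(base) != 1:
        raise ValueError("expected exactly one base character")
    return base + selector


def code_points(sequence: str) -> str:
    """Format a string the way the release mail lists it, e.g. '26FA FE0E'."""
    return " ".join(f"{ord(ch):04X}" for ch in sequence)


for base in ("\u26FA", "\u26FD"):  # TENT, FUEL PUMP
    for selector in (TEXT_STYLE, EMOJI_STYLE):
        sequence = with_style(base, selector)
        print(code_points(sequence),
              unicodedata.name(base),
              unicodedata.name(selector))
```

Note that each sequence is two code points long, so anything that counts "characters" by code point will see two where the user sees one.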

    The rest of the content of this blog you are reading has been redacted....

    ████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████
    ████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████
    ████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████
    ████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████
    ████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████
    ████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████
    ████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████

  • Sorting it all Out

    Sometimes things are extended in the wrong direction....

    SQL Server's code page, collation, casing, locale, and resource model are all direct attempts to extend the things that Windows provides in ways that make sense for SQL Server.

    This sentence bears repeating, I think. Because it seems (in the words of the late, great George Carlin) vaguely important.

    SQL Server's code page, collation, casing, locale, and resource model are all direct attempts to extend the things that Windows provides in ways that make sense for SQL Server.

    Okay, I feel better now.

    Now generally speaking I find the model to be supportable -- for example the way they snapshot the casing table in a given collation version rather than relying on the OS table (which could have disastrous compatibility concerns as they cover multiple versions of collations).

    To be honest, even in those cases when I don't agree with what they do at all -- like the way they split user locale/UI language/formatting/collation differently, or the way they combine "redundant" collations that can cause geopolitical discomfort/technical challenges -- I still respect their choices because, after all, they own the choices and the consequences thereof.

    There are two problems that I really think they should fix though, in some future version of theirs:

    The fact that the Hosted CLR (aka SQLCLR) doesn't use the SQL Server collations of the server they are on, which truly limits the usefulness of managed code in stored procedures, since you can't code against the server's own behavior.

    The fact that SQL Server makes its "server collation" setting based on a DEFAULT_SYSTEM_LOCALE analogue instead of a UI language of the LOCALSYSTEM account analogue, which blocks collations that are from "Unicode only" locales -- e.g. Hindi -- from working as server collations.

    There is no good reason for the limitation itself -- what better reason is there to remove it? :-)

    More about this setting here....

    Maybe it's our fault; we did, after all, call it DEFAULT_SYSTEM_LOCALE.

    I guess they were just extending our naming mistake of yore....

  • Sorting it all Out

    An SDK for the OSK? No way. Though if I may, I'll just say...

    So the other day, I was contacted by hal hubschman, with a message.

    Kind of a request for information.

    It went like this:

    Subject: on screen keyboards

    i read a number of your blog entries which i found informative.  i have been trying to find a osk sdk with no luck.

    do you know if one exists ?

    thank you for the blogs you have created

    hal

    Hmmm.

    I usually hate questions like this.

    Because they force me to answer with a rather hopeless negative.

    I mean, since there is no public Software Development Kit (SDK) for any version of the On Screen Keyboard (OSK) shipped with any version of:

    • Windows (including the new one in Windows 8)
    • Office
    • Tablet PC

    whatsoever.

    I hate that - very frustrating, why even cover this one in a blog?

    But then, after writing the above, I looked over next to me.

    To my Amazon Kindle.

    On the screen?

    The book I had just finished re-reading: Ender's Game, by Orson Scott Card.

    And I suddenly wondered whether, if Ender Wiggin expressed such a depressing thought, Bean wouldn't push him past this seemingly unbreakable wall.

    As a by the way, and not for nothing, for those of you who have never read Ender's Game, by Orson Scott Card, you're truly missing a really good book. You should look into it!

    I suddenly realized I was looking at this the wrong way.

    Each of those On Screen Keyboards, those Soft Keyboards, were designed to wrap the various keyboards provided by Windows and kbdtool.exe, by MSKLC and kbdutool.exe.

    So perhaps there was no SDK to let anyone control any aspect of any version of the OSK.

    But you could, through MSKLC, control any exposed aspect of the OSK's behavior anyway!

    Okay, this is not exactly the answer hal hubschman was perhaps hoping for -- maybe he wanted a way to extend an actual OSK, any OSK.

    The only supported way to do that, however, is to create one's own OSK.

    That would take a bunch of blogs' worth of knowledge, though a lot of them are technically already written....

    It would be a huge effort to try to string them all together like a jigsaw puzzle, filling in the gaps representing missing blogs, though not completely impossible.

    Just highly improbable!

    For now I'll hope that my other answer is sufficient for hal hubschman's question. :-)

  • Sorting it all Out

    If font linking doesn't fit the text to a T (or ț!), a Romanian letter may be right but not quite look it

    I've talked about font linking in a bunch of different blogs over the years. Many of them make a point implicitly that I am going to make quite explicit today.

    And that point is simply this: the thing that sucks most about GDI font linking is the way it mixes fonts that don't look right next to each other.

    Anyway, the other day colleague and coworker Laura passed on a question people were trying to get a handle on.

    Basically, it was a Romanian product that had strings looking something like this:

    The Ts don't all seem to line up here!

    Let me blow that up in case your browser doesn't make it bigger:

    The Ts don't all seem to line up here, but bigger!

    Now they were claiming to be using Segoe UI.

    And for whatever it's worth, the glyphs of most of the characters look to my amateur eye to be somewhat Segoe-like, perhaps. But not exactly.

    Remember that at one point, there were many more fonts floating around that were missing these characters.

    Perhaps a forensic typographer could do better, a-la İ şéè đêäđ ķéÿš etc.

    But do you see the fourth letter, which is clearly U+0074, aka LATIN SMALL LETTER T, versus the sixth letter, which is clearly U+021B, aka LATIN SMALL LETTER T WITH COMMA BELOW?

    Well, Segoe UI does not look this weird with these Latin letters alongside each other.

    In fact, here is just about every S and s and T and t in Segoe UI, right next to each other:

    Comparing all the Ss and Ts in Segoe UI -- all the same size.

    None come even close to being this incorrect looking....

    For the sake of good looking Romanian text, at least!

    Thankfully, they're going to fix this one, before a negative entry would need to be added to lists like The history of messing up Romanian on computers. If you know what I mean....

    Though even if not, we've come a long way in banishing the cedillas from Romanian text, right? :-)
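For anyone still banishing cedillas, the mapping is small enough to sketch. The helper name below is hypothetical, and the sketch assumes the text is already known to be Romanian: Turkish, for one, uses the cedilla forms of S legitimately, so this must never be applied blindly.

```python
import unicodedata

# Cedilla forms (historically mis-used for Romanian) -> comma-below forms.
CEDILLA_TO_COMMA_BELOW = str.maketrans({
    "\u015E": "\u0218",  # S WITH CEDILLA (capital) -> S WITH COMMA BELOW
    "\u015F": "\u0219",  # s with cedilla -> s with comma below
    "\u0162": "\u021A",  # T WITH CEDILLA (capital) -> T WITH COMMA BELOW
    "\u0163": "\u021B",  # t with cedilla -> t with comma below
})


def banish_cedillas(romanian_text: str) -> str:
    """Replace cedilla S/T with the comma-below letters Romanian expects.

    NFC first, so that 's' + COMBINING CEDILLA is also caught.
    """
    return unicodedata.normalize("NFC", romanian_text).translate(
        CEDILLA_TO_COMMA_BELOW)


fixed = banish_cedillas("\u0163ar\u0103")  # "ţară" typed with a cedilla
for ch in fixed:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```

Of course, once the text has the right code points, the font still needs glyphs for them. If it does not, font linking steps in, and that is exactly where the mismatched Ts above come from.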

  • Sorting it all Out

    3 x 7 can be a lot more than 21, sometimes!

    So we currently have the Building Windows 8 (An inside look from the Windows engineering team) Blog.

    It is localized into French, aka Français.
     
    And it is localized into German, aka Deutsch.
     
    And it is also localized into Brazilian Portuguese, aka Português (Brasil).
     
    We didn't skip localizing into Korean, aka 한국어.
     
    And it was also important localizing into Japanese, aka 日本語.
     
    We didn't ignore localizing into Chinese, aka 简体中文.

    And last but not least we localize into Russian, aka Русский.

    As a Blog being localized, the project goals are clear.

    The blog will build interest and enthusiasm among developers, IT Pros, and enthusiasts for the next version of Windows. It is the engineering blog for Windows 8, and the content is provided by Steven Sinofsky and the Windows engineering team.

    For the WWLI content localization team, our goal was to accomplish the following:

    • Build interest and enthusiasm among international developers, IT Pros, and enthusiasts for the next version of Windows by providing localized blog content 
    • Provide parity and consistency of message across the languages selected for localization 
    • Release the localized B8 blog in 7 languages (DE, FR, KO, JA, PT-BR, RU, ZH-CN) no later than 2 business days after the English blog.
    • Build a sustainable, scalable, and cost-effective process for blog localization

    We are now well over 45 blogs in, and feedback has been really good.

    But we didn't stop there.

    We also have a Windows Store for developers Blog, and we have started doing the localization for the very popular IEBlog as well.

    And each will be localized into those same seven languages!

    Quite a grid we're building:

    Language | Building Windows 8 (An inside look from the Windows engineering team) | Windows Store for developers | IEBlog
    French | Conception de Windows 8 (Vision en coulisses de l'équipe d'ingénierie Windows) | Blog Windows Store pour les développeurs | IEBlog Français
    German | Die Entwicklung von Windows 8 (Einblicke in die Arbeit des Windows-Entwicklerteams) | Windows Store-Blog für Entwickler | IEBlog Deutsch
    Brazilian Portuguese | Criando o Windows 8 (Nos bastidores com a equipe de engenharia do Windows) | Blog da Loja do Windows para desenvolvedores | IEBlog Português
    Russian | Создание Windows 8 (Взгляд изнутри от группы разработчиков Windows) | Магазин Windows: блог для разработчиков | IEBlog Русский
    Japanese | Building Windows 8 (Windows エンジニアリング チームによるブログ) | Windows Store 開発者向けブログ | IEBlog 日本語
    Korean | Windows 8 빌드 (Windows 기술팀 내부 모습) | 개발자용 Windows 스토어 블로그 | IEBlog 한국어
    Simplified Chinese | Building Windows 8 (来自 Windows 工程团队的内部视点) | 面向开发人员的 Windows Store 博客 | IEBlog 简体中文

    Nothing short of Amazing!

  • Sorting it all Out

    The evolving Story of Locale Support, part 16: We can't scale to a Xishuangbanna Dai locale, but…

    This series has been largely discussing a particular "meta-issue".

    The fact that our model for locale support is indeed evolving. And much more quickly than usual.

    Some of the blogs in this series capture the "missing links", which can be invaluable since not everything can be deduced from a finished product.

    Examples?

    Part of it is the new keyboards that can support languages for which no locales currently exist, as described in part 2.

    And part of it is in the new list of languages that sits under those new keyboards and supports way more than our locale list can perhaps ever reach, as described in part 15.

    Exciting times, aren't they? :-)

    Well, let's add one of those new keyboards.

    Like the one for the New Tai Lue script/Xishuangbanna Dai language:

    Adding the Xishuangbanna Dai (New Tai Lue) keyboard layout

    Cool!

    Even cooler -- how quickly I typed ᦎᦷᦑᦺᦜᦺᧈ ᦉᦲᧇᦉᦸᧂᧅᧃᦓᦈ with the Windows 8 soft keyboard. :-)

    Admittedly I built the original keyboard layout it was based on -- your mileage may vary....

    If you have the Developer Preview you can see how we are improving here already in supporting finding new language names via some of those script names!

    It's on our Language List now, and everything!

    The updated Windows 8 language list, with that New Tai Lue at the bottom

    This is awesome!

    But then we ran into a problem when we tried to search for some New Tai Lue script/Xishuangbanna Dai language text in an XPS file we just created.

    Because we were using that New Tai Lue keyboard, the one that the "WinLangDB" list put under the code of khb, for the Lü macrolanguage.

    The search ends up failing since functions like FindNLSStringEx can only handle supported locales by NLS rules, not WinLangDB rules. This is no problem in Notepad, which uses the default user locale, but a big problem for anyone that tries to be more clever than that.

    In a way, it's surprising that it wasn't found earlier. I guess there aren't too many things using the clever way -- we should maybe have more of them!

    Obviously this is a small mis-step in the bold move forward to support things that we don't fully support, and we'll have to figure out what to do here.

    In fact, people are looking at this right now. Since NLS supports all of the underlying characters in the default table, there are lots of possible solutions (note that if nothing were done then you couldn't even search for English text since an "invalid" locale name makes the NLS functions fail!).

    I mean, since we lack the resources at this point to add a Xishuangbanna Dai locale. :-)

    Of course this is much bigger than New Tai Lue -- the WinLangDB list supports such a huge subset of valid ISO-639-3 codes that doing nothing would hurt even more than just this one case!

    But truthfully I'm not worried here -- 15+ steps forward and one step back in the pre-release has plenty of time to be changed to 16+ steps forward in the actual release.

    Alternately, if they don't fix it then it will make a great KB article, maybe. :-)

    And either way, I'm proud to be a part of those steps in our evolving Story of Locale Support....

  • Sorting it all Out

    You can do CESU-8 if you need to; we went in a slightly different direction....

    Regular reader Dan asked me via the Contact link:

    We just upgraded our customer desktops from Windows 2000 to Windows 7, and we're seeing a major break in our text processing app.

    We've debugged the problem pretty thoroughly, and it doesn't look it's our app at all. Notepad seems to be breaking our Plane 1 and Plane 2 text. Which seems like it must be impossible, isn't support of supplementary characters a Windows 7 feature?

    I'll admit I was confused at first, though the problem he described seemed kind of familiar.

    And then it came to me.

    The app was basically supporting supplementary characters on Windows before we really were.

    Well, that isn't entirely accurate.

    But there was a weird time before XP shipped that we were okay with the six byte form for supplementary characters, before Unicode got more explicit about considering it to be ill-formed and before we started conforming to Unicode's stricter definition....

    Dan's Line of Business app was essentially using CESU-8, not UTF-8. Given the weird difference between how Notepad initially detects UTF-8 and how it converts the data -- described previously in blogs like (It wasn't me) -- the solution becomes clearer:

    • Either move to the 4-byte form of UTF-8 supplementary characters, or
    • Do all of the conversion outside of Notepad and Windows!

    Personally, I'd recommend the former option -- the latter is kind of contrary to what Windows, Microsoft, and Unicode are doing these days.

    Though if an application has a heavy investment in the 6-byte form, then as long as it is kept internal to the app (or properly marked when communicating with those who understand it), it isn't the end of the world....
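To see the difference between the two forms, here is a rough sketch of a CESU-8 encoder and decoder next to Python's own UTF-8 codec. The function names are hypothetical, and this is in no way the code Dan's app or Notepad uses; it only shows where the six bytes come from: each half of the UTF-16 surrogate pair gets its own three-byte sequence.

```python
def cesu8_encode(text: str) -> bytes:
    """Encode text as CESU-8: supplementary characters become two
    3-byte sequences (one per UTF-16 surrogate) instead of one 4-byte one."""
    out = bytearray()
    for ch in text:
        code_point = ord(ch)
        if code_point >= 0x10000:
            offset = code_point - 0x10000
            high = 0xD800 | (offset >> 10)    # high (lead) surrogate
            low = 0xDC00 | (offset & 0x3FF)   # low (trail) surrogate
            for unit in (high, low):
                # 'surrogatepass' lets the codec emit the 3-byte form
                # that strict UTF-8 treats as ill-formed.
                out += chr(unit).encode("utf-8", "surrogatepass")
        else:
            out += ch.encode("utf-8")
    return bytes(out)


def cesu8_decode(data: bytes) -> str:
    """Decode CESU-8 by letting the surrogates through, then pairing
    them up with a round trip through UTF-16."""
    with_surrogates = data.decode("utf-8", "surrogatepass")
    utf16 = with_surrogates.encode("utf-16-le", "surrogatepass")
    return utf16.decode("utf-16-le")


plane1 = "\U00010400"  # a Plane 1 character
print(plane1.encode("utf-8").hex(" "))   # 4-byte UTF-8 form
print(cesu8_encode(plane1).hex(" "))     # 6-byte CESU-8 form
```

A strict UTF-8 decoder rejects the six-byte form outright, which is the kind of mismatch that bites when data written under the old assumptions meets a newer, stricter reader.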

  • Sorting it all Out

    It is *not* called the Desert Desert, dammit!

    You probably know people, or work with people, or are one of those people, who call it an "ATM machine".

    And you may know people, or work with people, or are one of those people, who call it a "Light Emitting LASER".

    Hopefully you laughed when the mobsters from the movie Mickey Blue Eyes noted the sign on the restaurant "The La Trattoria", despite the fact that the meaning of the phrase was "The The Trattoria", as a sign of irony at the faux intellectual bravado leading to ignorance of the proper name of a fancy Italian restaurant.

    Perhaps you were around to chuckle with Mike or Cathy or me when a particular group admin used to refer to "Very VIP people", realizing that the meaning of "VIP" includes the words being repeated.

    Many people I know found it mildly ironic that natives and former natives of Iran wanted their language فارسى (spelling FEH ALEF REH SEEN ALEF MAKSURA) called Persian rather than Farsi, even though the native name was pronounced FARSI by those same native speakers.

    And yet, I do not want to know if you make this other mistake.

    I am going to say La La La I am not listening to you yet you are still talking if you try to confess you are guilty of it.

    And after you have read this blog today you will be able to point out the mistake any time someone else makes it.

    Anytime you see Google Maps refer to locations as being in the "Sahara Desert" like so:

    Because the word for Desert in Arabic is

     صحراء

    aka ARABIC LETTER SAD + ARABIC LETTER HAH + ARABIC LETTER REH + ARABIC LETTER ALEF + ARABIC LETTER HAMZA

    aka SAD HAH REH ALEF HAMZA

    aka SAHARA.

    Because you know it is silly to call the sandy, hot region in Northern Africa the Desert Desert.

    You aren't that silly, right? :-)
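If you want to check the letter-by-letter spelling for yourself, the Unicode character names will do it. A minimal sketch, assuming the word is exactly the five code points spelled out above (the helper name is made up):

```python
import unicodedata

# SAD + HAH + REH + ALEF + HAMZA, written as escapes so that nothing
# depends on how this page happens to be encoded.
DESERT = "\u0635\u062D\u0631\u0627\u0621"


def spell(word: str) -> list:
    """Return the Unicode character name of each letter in the word."""
    return [unicodedata.name(ch) for ch in word]


for name in spell(DESERT):
    print(name)
```

The same helper confirms that the language name mentioned earlier starts with ARABIC LETTER FEH and ends with ARABIC LETTER ALEF MAKSURA.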

  • Sorting it all Out

    …wondering about Paul: where he could be, who he's with, what he's thinking, and if he'll ever return someday…

    Cue somewhat gratuitous Hotel La Rut video:

    At least I know Paul was thinking of me, so I don't have to wonder about that part!

    You see, the other day, my friend (and former colleague from Microsoft) tweeted to me over Twitter:

    Ah, a somewhat under-documented bit of info, that.

    It involves LPKSETUP.EXE.

    Let's try running it:

    A nice dialog pops up, whether Vista or Windows Server 2008 or Windows 7 or Windows Server 2008 R2.

    It looks like this:

    Now obviously we want to Install display languages (if anyone wants to look into either the Uninstall display languages or How do I get additional display languages? options, they can do so, of course!).

    Anyway, once you choose to Install display languages, you'll see the option to Choose your method of install:

    That Launch Windows Update option may be of some interest, in other circumstances. After all, it points out the other way to find Language Packs.

    But for now we'll stick with Paul's scenario, and Browse computer or network. How better to Locate and install display languages manually, anyway?

    Here we go:

     

    Okay, I'll now explain a little bit of how this dialog works.

    You choose a directory.

    And then it will traverse that directory and its immediate subdirectories (and no deeper sub-subdirectories!) to look for Language Packs or Language Interface Packs!

    I took the trouble to copy 16 different lp.cab files to my local Windows 7 machine, in a directory structure I have never personally witnessed:

    Now when I browse to the subdirectory, the magic happens.

    Well, I guess I can't call it "magic" just like I can't call a card trick "magic" if I have to manually change the order of the deck in front of you to make the trick work!

    Anyway, you'll see it detected a bunch of valid lp.cab files:

    Only 12 of them are valid for my Windows 7 x64 machine; the other 4 are for Windows 7 x86 machines -- though they could have also been Language Packs/Language Interface Packs for other Windows versions; they'll all fail here.

    I suppose this explains why LPKSETUP.EXE doesn't traverse any deeper -- since it is taking the time to open up every CAB file (I renamed several of them to be sure) and see if it's valid. Traversing too deep could potentially start to get painful!
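The traversal rule described above (the chosen directory plus one level of subdirectories, and nothing deeper) can be sketched in a few lines. This is only an illustration of the rule, not how LPKSETUP.EXE is implemented: the function name is made up, the filter on a .cab extension is my assumption, and the expensive part (opening each CAB to see whether it is a valid pack for this machine) is left out entirely.

```python
import os
from typing import List


def find_candidate_cabs(root: str) -> List[str]:
    """List .cab files in root and in its immediate subdirectories only."""
    found = []
    for name in sorted(os.listdir(root)):
        path = os.path.join(root, name)
        if os.path.isfile(path) and name.lower().endswith(".cab"):
            found.append(path)
        elif os.path.isdir(path):
            # One level down, and no further: directories found here
            # are deliberately not descended into.
            for child in sorted(os.listdir(path)):
                child_path = os.path.join(path, child)
                if os.path.isfile(child_path) and child.lower().endswith(".cab"):
                    found.append(child_path)
    return found
```

Capping the depth keeps the number of files that have to be opened and validated bounded, which fits the guess above about why the tool stops where it does.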

    Note to CSS: A Microsoft Knowledge Base article aimed primarily at IT folk/system builders explaining how to properly make use of LPKSETUP.EXE to allow any language selection the IT folk/system builders choose to make appear to their users (with the proper steps to work properly with secured desktops) may be the most completely awesome-est KB article of the year! Any takers?

    Okay, so that's it. Hopefully it will answer Paul's question.... :-)

    I wonder what he's been up to, who he's with, what he's thinking (using Vista?!?), and if he'll ever return someday. Don't you?

    So I'm just wondering about Paul -- what he's been up to, who he's with....

    I suppose I am in a La Rut myself!

  • Sorting it all Out

    Avoiding the Snowpocalypse!

    • 1 Comments

    I originally had a seat on the 8:30am flight to Las Vegas.

    And then God apparently said no, not so much.

    She made plans to dump up to a foot of snow or more on Seattle on Wednesday.

    Crap.

    Alaska Air was a bit short-sighted, and they charged me $100 for the change, but I decided to follow in the footsteps of brave Sir Robin:

    Brave Sir Michael ran away
    Bravely ran away.
    'Ere mounds of snow dumped upon his head, he bravely turned his tail and fled... 

    I'll try to blog something substantive tomorrow.

    Though I can't guarantee it will be as entertaining!

     

  • Sorting it all Out

    The evolving Story of Locale Support, part 15: Fixing our listings up in Windows 8!

    • 5 Comments

    Previous blogs from this series:

    Windows support of locales, and in fact the whole locale model in Windows, is impressive.

    It's substantial.

    And....

    It's as confusing as all get out!

    I mean, even almost seven years ago when I wrote What is my locale? Well, which locale do you mean? to list and define all the different kinds of locales in Windows:

    • DEFAULT USER LOCALE (Windows XP term: "Standards and Formats")
    • DEFAULT SYSTEM LOCALE (Windows XP term: "Language for non-Unicode Programs")
    • DEFAULT USER INTERFACE LANGUAGE (Windows XP Term: "Language used in menus and dialogs")
    • DEFAULT INPUT LOCALE (Windows XP Term: "Default Input Language")

    I was self-consciously aware of how confusing everyone found all this.

    Now virtually everyone I talked to agreed that each term was entirely explainable, especially in Windows XP and later when they were each given new terms that didn't use the same word LOCALE over and over again.

    But the only ones who were willing to call this motley crew intuitive were completely and totally high at the time.

    And I'll be honest, the ones unwilling to call it intuitive were right.

    The model, as expansive and feature-filled as it may be, is incredibly confusing.

    The previous changes aimed at incrementally improving terminology were perhaps worthwhile, but ultimately unable to solve the real problem.

    Until Windows 8....

    Now first they take the old Regional and Language Options:

    (shown here from Windows 7) to start.

    Now instead of one Control Panel Applet, there are two in the Control Panel:

    One for Region, with just three tabs, none of which say Language:

    and one new one, for Language:

    Now this Language Control Panel Applet is for User Interface Languages (if they are installed), for other language-specific services (if they are installed) and for Keyboard Layouts (whether atop actual hardware keyboards or soft keyboard layouts).

    You can see the new Keyboard List right here - notice the order is the same as from the Language applet, above:

      Now this does start to thin the herd in a more meaningful way.

    Though speaking for myself, it is an odd direction when you consider that the Formats list is configured over in the Region applet while the Keyboards list is configured in the Language applet, and that both of their built-in lists are considerably larger than the list of User Interface languages or any other services.

    Though with changes like the ones described in part 2 (raising the roof on keyboards), the keyboard list is no longer completely limited to the "supported locales" that populate the Formats list anyway.

    So perhaps my concerns about the mode of disconnect are unwarranted. :-)

    I will conditionally consider this to be a good evolutionary step that will simplify setting up Windows for typical users -- whether changing UI language, adding keyboards, or whatever.

    In the long run, I think the direction here will only get better.

    Now in future parts, I'll dig in further here, looking at new programmatic means of getting information....

  • Sorting it all Out

    I'm reasonably certain that those who disagree with me here are wrong!

    • 4 Comments

    So, the other day, I wrote How to detect if a locale is Bidi, Windows 7/8 edition.

    This is a topic I had covered a bunch of times over the years, in many prior blogs, from How To [NOT] detect that a locale is bidi to How To detect that a culture is bidi to Cue the smarter version of GetDateFormat... ok, it's a wrap! and so on.

    Most of the Win32 answer prior to the introduction of LOCALE_IREADINGLAYOUT was using the LOCALESIGNATURE.

    More specifically, bits 123, 124, and 125 of the Unicode Subset Bitfields:

     Bit  Meaning
     123  Windows 2000 and later: Layout progress, horizontal from right to left
     124  Windows 2000 and later: Layout progress, vertical before horizontal
     125  Windows 2000 and later: Layout progress, vertical bottom to top

    The combinations of different values of these three bits make the description of almost any text directionality outside of Boustrophedon (or Rongo-Rongo) possible:

     Bit 123  Bit 124  Bit 125  Text Rendering Direction       IREADINGLAYOUT equivalent
     0        0        0        LeftToRight, then TopToBottom  0
     1        0        0        RightToLeft, then TopToBottom  1
     1        1        0        TopToBottom, then RightToLeft  2
     0        1        0        TopToBottom, then LeftToRight  3
     0        0        1        LeftToRight, then BottomToTop  n/a
     1        0        1        RightToLeft, then BottomToTop  n/a
     0        1        1        BottomToTop, then LeftToRight  n/a
     1        1        1        BottomToTop, then RightToLeft  n/a

    Now as the last column hints at, the four reading layout choices we support can all be derived entirely from the LOCALESIGNATURE bits.

    The additional four rendering options theoretically able to be captured by these bits but not available to the new flag are not used as the primary rendering for any language we support.
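    The table above can be sketched as a small C function. This is only an illustration of the mapping, not code from Windows: the function names are hypothetical, and the second helper assumes it is handed a copy of the four 32-bit words of the Unicode Subset Bitfields (the lsUsb member of LOCALESIGNATURE), with bit n living in word n / 32 at position n % 32.

```c
#include <stdint.h>

/* Hypothetical helper: maps the three layout bits to the
   LOCALE_IREADINGLAYOUT-style value from the table (0 through 3).
   Returns -1 for the four bottom-to-top combinations, which have
   no IREADINGLAYOUT equivalent. Each argument is zero or nonzero. */
int reading_layout_from_bits(int bit123, int bit124, int bit125)
{
    if (bit125) {
        return -1;               /* bottom-to-top: the "n/a" rows */
    }
    if (!bit124) {
        return bit123 ? 1 : 0;   /* horizontal: RTL or LTR */
    }
    return bit123 ? 2 : 3;       /* vertical first: columns RTL or LTR */
}

/* Hypothetical helper: pulls bits 123, 124 and 125 out of a 128-bit
   field stored as four 32-bit words (assumed to mirror lsUsb) and
   maps them with the function above. */
int reading_layout_from_usb(const uint32_t usb[4])
{
    int bit123 = (usb[3] >> (123 - 96)) & 1;
    int bit124 = (usb[3] >> (124 - 96)) & 1;
    int bit125 = (usb[3] >> (125 - 96)) & 1;

    return reading_layout_from_bits(bit123, bit124, bit125);
}
```

    With only bit 123 set (the value 0x08000000 in the last word), the sketch yields 1, the right-to-left case.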

    A part of me wishes both

    were entirely derived from these three bits, since doing that directly only satisfies my inner database developer that hates storing repetitive data in multiple places.

    Of course in a mature society, there is room for disagreement, but in this case I'm reasonably certain that those who disagree with me here are wrong. :-)

    Since had we done it sooner, it might have prevented us from shipping managed code bugs like the one described in It's not right when IsRightToLeft is wrong, and native code bugs like the one described in Double Secret ANSI, part 2 (the brokenest one yet, sorry 'bout that!).

    Because the best way to make sure the data is correct is to use the data.

    Early and often, as both those bugs that made it to shipping products prove quite effectively (to our detriment at the time).

    To be perfectly honest, I wish we would make this change even now, because we will always consider any difference between these three items a bug, and the best way to make sure that they don't fall out of sync is to use one source for all of them.

    We could in theory make this change later this week to the data behind the properties.

    Now I am an owner of the data, but this would also be code to change (in multiple products across multiple divisions). I can appeal to the owners to fix the long-term sync problem, though.

    Before that, we can even fix the problem I mentioned the other day in How to detect if a locale is Bidi, Windows 7/8 edition, where we stop returning results that are incorrect 99% of the time (claiming verticality for CJK and Mongolian), by default...

    Technically, I could have used that idea and made this another part of the "The evolving Story of Locale Support" series, but I'm not confident that everyone will agree, so who knows whether we'll evolve that way, yet!

  • Sorting it all Out

    How to detect if a locale is Bidi, Windows 7/8 edition

    • 1 Comments

    The other day, I was forwarded a question by a colleague.

    They wanted to know why the following code was not letting them detect Bidi properly:

    UINT ret = GetLocaleInfoEx(LOCALE_NAME_USER_DEFAULT, LOCALE_IREADINGLAYOUT, Layout, ARRAYSIZE(Layout));
    if(ret && (Layout[0] != L'0')) {
        // Treat the locale as Bidi: WARNING: THIS CODE IS WRONG!
    }

    Now this code is bad for several reasons.

    First of all, let’s look at the doc topic for LOCALE_IREADINGLAYOUT:

    Windows 7 and later: The reading layout for text. Possible values are defined in the following table.

    Value Meaning
    0 Read from left to right, as for the English (United States) locale.
    1 Read from right to left, as for Arabic locales.
    2 Read vertically from top to bottom with columns to the right, and read from right to left, as for the Japanese (Japan) locale.
    3 Read vertically from top to bottom with columns proceeding to the right, as for the Mongolian (Mongolian) locale.

    Now obviously code that assumes everything other than 0 is RTL is going to have problems.

    But okay, that is an easy fix.
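    One way that easy fix might look, as a minimal sketch: test for the right-to-left value specifically rather than for "anything but 0". The function name here is hypothetical, and it only inspects the string that a call like the one above would have placed in Layout; it makes no Windows calls itself.

```c
#include <wchar.h>

/* Hypothetical helper: returns 1 only when the reading layout string is
   exactly "1" (read from right to left), and 0 for "0", "2", "3",
   an empty string, or a NULL pointer. */
int is_rtl_reading_layout(const wchar_t *layout)
{
    return layout != NULL && layout[0] == L'1' && layout[1] == L'\0';
}
```

    The caller would still check that GetLocaleInfoEx returned a nonzero result before trusting the buffer contents.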

    There is a deeper problem, one that is a bit more insidious….

    Let’s start by looking at what locales fall into each category:

    Value Locales that have this value
    0 English, Russian, Thai, etc.
    1 Hebrew, Arabic, Persian, etc.
    2 Japanese, Chinese, Korean
    3 Mongolian

    Now in practice, vertical support in applications is such that the majority of text in Mongolian, Japanese, Chinese, and Korean should be treated as if it were in that first category along with English. Therefore, returning 2 or 3 is a kind of unrealistic idealism, which ultimately makes the code even more flawed than previously thought!

    There is more to the story beyond this one buglet, and I’ll be talking about this further, soon….
