Sorting it all Out Michael Kaplan's random stuff of dubious value Be sure to read the disclaimer here first!
Someone asked me yesterday if there were lists for:
Well, they knew that #1 and #2 were probably somewhere, but they did not know about #3.
I was a little bored, so I assembled the lists for Windows 7.
A few caveats:
So here are a bunch of lists:
Table #1: The locales representing languages into which Windows 7 localizes:
Table #2: The locales representing languages for which Windows creates Language Interface Packs, aka LIPs:
Table #3: Locales whose identifiers are not directly associated with any localizations of Windows, even if a related identifier might make for one representing a suitable localization:
Enjoy!
I'm a little confused by table 3. In what sense does cy-GB and es-AR "exist" and cy-AR and nv-DK (my evergreen example) "not exist"? Only in the sense that the first has a numeric LCID and the second does not? Or also in the sense that some cultural information is available for the former but not the latter?
They exist in the sense that they are defined on every version of Windows 7 out of the box, with no need to define or install anything.
And also the LCID thing, though I care less about that these days....
I suppose that since English (United Kingdom) is in Table 3 that means the control panel must talk about color (instead of colour). XP certainly does. This goes to show how small the differences are in OS terms.
It's always blindingly obvious in Office products as soon as the spell checker starts complaining that I don't type like an American.
I'm wondering where Windows got the traditional Mongolian text from? I only ask because I think that the word for "Mongolian" (ᠮᠤᠨᠭᠭᠤᠯ = munggul) is misspelt. Not that it surprises me, as the deeply flawed encoding model for Mongolian makes homographic misspellings inevitable (Mongolian "o" and "u" are essentially the same letter, but have been encoded separately on phonetic grounds rather than unified on the basis of their glyph shape, as is the case with every other Unicode script -- akin to encoding two Latin letter C's, one "hard c" and one "soft c"). Although the Windows spelling gets the greatest number of Google hits (precisely because it is used in Windows?), I think the "u"s should be "o"s, and the "ngg" should be "ŋg". There seem to be four different spellings current on the internet:
blogs.msdn.com/.../10167604.aspx = ᠮᠤᠨᠭᠭᠤᠯ munggul <182E 1824 1828 182D 182D 1824 182F>
en.wikipedia.org/.../Mongolian_language = ᠮᠣᠨᠭᠭᠣᠯ monggol <182E 1823 1828 182D 182D 1823 182F>
www.geonames.de/coumn.html = ᠮᠣᠩᠭ᠋ᠣᠯ moŋg{VS1}ol <182E 1823 1829 182D 180B 1823 182F>
en.wikipedia.org/.../Mongolian_script = ᠮᠣᠩᠭᠣᠯ moŋgol <182E 1823 1829 182D 1823 182F>
The above examples of the same word are all represented using slightly different characters ("o" vs "u", "ng" vs "ŋ", and "g with a variation selector" vs "g without a variation selector"), but on systems that support traditional Mongolian the first three all look almost identical (if you look closely you'll see that with Mongolian Baiti the shape of the first "g" in spellings with "ngg" is not quite the same as the shape of the "ŋ" in spellings with "ŋg" -- the character "ŋ" is in origin a ligature of "n" and "g" so you would expect "ŋ" and "ng" to look very similar). The last spelling (ᠮᠣᠩᠭᠣᠯ moŋgol), without the VS, lacks the two dots to the side of the "g". As the distinction between U+1823 and U+1824 is artificial I suspect that most Mongolian writers just use whatever letters appear to give the correct result, and so eye spellings proliferate; and the Windows spelling "munggul" *looks* correct, so who's going to complain? I believe (but am no longer sure) that the correct spelling should be ᠮᠣᠩᠭ᠋ᠣᠯ moŋg{VS1}ol with a variation selector to add two dots to the preceding "g", but this is the spelling that gets the fewest Google hits (well, Google ignores the VS, which is correct from a Unicode perspective, but I actually think it is wrong from a user perspective as Mongolian variation selectors are not really "ignorable" in the same way that ordinary variation selectors are). This may simply be because variation selectors are awkward to use, and so people prefer spellings without them -- typing "n" + "g" gets the same glyph as "ŋ", but automatically gets the dots under the following "g", whereas with "ŋ" you need to add a VS after the following "g", with the result that (the incorrect?) "n" + "g" wins over (the correct?) "ŋ". Quite why there are spellings with "u" instead of "o", I have absolutely no idea, as all scripts used for writing Mongolian that visually distinguish "o" and "u" spell the word with "o" (e.g. modern Cyrillic Монгол and 14th century Phags-pa ꡏꡡꡃ ꡣꡡꡙ).
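For anyone who wants to poke at the four spellings themselves, here is a minimal Python sketch built only from the code point sequences quoted in the comment above. The helper names (`code_points`, `strip_fvs`, `fold_o_u`) and the dictionary keys are made up for illustration, and the "o"/"u" folding is just one hypothetical way a search tool might treat the two letters as equivalent; it is not how Windows, Google, or Unicode actually handle them.

```python
import unicodedata

# The four spellings quoted above, keyed by an illustrative label for the source.
SPELLINGS = {
    "msdn blog": "\u182E\u1824\u1828\u182D\u182D\u1824\u182F",
    "wikipedia Mongolian_language": "\u182E\u1823\u1828\u182D\u182D\u1823\u182F",
    "geonames": "\u182E\u1823\u1829\u182D\u180B\u1823\u182F",
    "wikipedia Mongolian_script": "\u182E\u1823\u1829\u182D\u1823\u182F",
}

# Mongolian free variation selectors one to three (U+180B..U+180D).
FVS = {"\u180B", "\u180C", "\u180D"}


def code_points(text):
    """Return the code points of text as space-separated 4-digit hex."""
    return " ".join(f"{ord(ch):04X}" for ch in text)


def strip_fvs(text):
    """Drop Mongolian free variation selectors, as a VS-ignoring search would."""
    return "".join(ch for ch in text if ch not in FVS)


def fold_o_u(text):
    """Hypothetical folding: treat U+1824 (u) as if it were U+1823 (o)."""
    return text.replace("\u1824", "\u1823")


for source, word in SPELLINGS.items():
    print(f"{source}: <{code_points(word)}>")
    for ch in word:
        # Fall back to a placeholder if this Python's Unicode data lacks a name.
        print(f"    U+{ord(ch):04X} {unicodedata.name(ch, '(unnamed)')}")
```

Under these assumptions, stripping the variation selector makes the geonames spelling identical to the Mongolian_script one, and folding "u" to "o" makes the Windows spelling identical to the Mongolian_language one, which is roughly the point about eye spellings being indistinguishable to anything that does not look at the exact code points.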
Hey Andrew! I think there's blame to go around, since better input methods in Windows and Office spellcheckers could go a long way to improve the situation, even with trouble in Unicode's encoding....but either way I forwarded the issues you mentioned on to see what we can do about them.