Sorting it all Out
Michael Kaplan's random stuff of dubious value
Be sure to read the disclaimer here first!
  • Sorting it all Out

    Keyboard driver development isn't easy

    • 2 Comments

    Over in the Suggestion Box, Daniela Semeco asked:

    Hi Michael,
     
    I have enjoyed reading your blog a great deal. I am inventor, and I'm looking for a keyboard device driver developer. Could you please recommend someone to me or suggest where to look? I've done what I could with MSKLC and KBDEdit, and I realize that this will not suffice. I need something to be written from scratch. I am in San Francisco and would like to hire someone in the Bay Area. I would feel more comfortable meeting them in person, especially since it involves intellectual property. I'm looking to have software developed for Windows, Mac and Linux OS, as well as for iPad and Android tablet. I would appreciate any feedback. Many thanks!
     
    -Daniela

    The landscape in this space is indeed bleak.

    My first thought would have been Marc Durdin and Tavultesoft, though despite his visits to the Bay Area and Seattle, he is still located in Australia.

    He should be up here and down there in the near future if you wanted to meet, and I know he has dug deeper into the keyboard space than anyone outside of Microsoft I've ever known (and most of the people inside, too!).

    He may reach out to you after seeing this blog; if not, feel free to reach out to him....

    Beyond that, I've been pretty unimpressed with others I've interacted with in this space.

    And I've interacted with a lot of them over the years.

    It's hard to say more without knowing more about what you're looking for; perhaps there are people who would be helpful in some specific area?

    Or if you're at the Internationalization and Unicode Conference in Santa Clara in October (site), you can see me there and tell me more....

  • Sorting it all Out

    Some notes about The Locales of Windows 8, not yet divvied up...

    • 4 Comments

    The other day my blog The Locales of Windows 8, not yet divvied up... went up.

    Some people made observations.

    Some people asked questions.

    A few of them even did all that in the comments....

    Like Andrew West, who commented:

    I've said it before, and I'll say it one last time, I think the MS spelling of ᠮᠤᠨᠭᠭᠤᠯ (m-u-n-g-g-u-l) for the more usual ᠮᠣᠩᠭᠣᠯ (m-o-ng-g-o-l) is bizarre.
     
    On a different matter, can you explain why Basque, Catalan and Cherokee repeat the language name in parentheses instead of giving the name of the country in which the language is spoken, as is the case with all the other locales on the list?  And why is "Iran" present in the native name for Persian but omitted from the display name?

    I forwarded on that Mongolian comment previously, but I will do it again. :-)

    For Persian I'm not sure - maybe that whole "for the ex-pats, with no contact with the country" thing came into play.

    I think the fact that we don't say it's Farsi anymore was the biggest issue for them.

    Those three cases where the BCP-47 tag has a country/region but the name does not -- and the fourth case involving moh-CA aka Mohawk (Mohawk) -- are I think just cases where the locale is covering dependent nations (not sovereign nations) where it just makes people more comfortable. Just trying to be a little respectful....

    And then Azarien pointed out some of the same things and some new things:

    Interesting:
     
    English (Caribbean) having weird code of en-029.
     
    English (Republic of the Philippines) having native name of English (Philippines).
     
    Cherokee (Cherokee) and Catalan (Catalan) with a non-country country name.
     
    Persian with no country name at all.
     
    Norwegian, Bokmål (Norway) being the only Display Name with a non-ASCII character.
     
    And no less than 6 Sami locales.

    Now the English (Philippines) vs. English (Republic of the Philippines) case is just a side effect of the English display name coming from us and the native name coming from the native reviewer.

    It is a good candidate to clean up at some point -- we don't have a good reason to use the long name if the language expert doesn't think it's important! :-)

    The Norwegian, Bokmål (Norway) case is a longstanding issue we've just always had, in part because no Norwegian we've ever worked with or talked to found it acceptable to use Bokmal instead.

    As for that code, Doug Ewell explained it:

    @Azarien: '029' is the UN M.49 code element for the Caribbean.

    It's a little embarrassing, but we had this English (Caribbean) locale, but we had no ISO-3166 code for it. So someone found that 029 and we went with it.

    There's probably something smarter we could have done later via BCP-47 but no one wanted to change the name again, so it stayed that way.
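    To make the en-029 oddity concrete: BCP-47 allows a region subtag to be either a two-letter ISO 3166-1 code or a three-digit UN M.49 code, so a tag like en-029 is perfectly well-formed. Here is a minimal sketch of telling the two apart. The RegionSubtag class and its Classify method are made up for illustration; they are not part of any Windows or .NET API.

```csharp
using System;

static class RegionSubtag
{
    // Classifies the region subtag of a BCP-47 tag such as "en-029" or "az-Cyrl-AZ".
    // Returns "ISO 3166-1", "UN M.49", or "none" when no region subtag is present.
    public static string Classify(string tag)
    {
        string[] parts = tag.Split('-');
        // parts[0] is the language subtag; scan the subtags that follow it.
        for (int i = 1; i < parts.Length; i++)
        {
            string p = parts[i];
            if (p.Length == 1) break;                         // singleton: extension or private use begins
            if (p.Length == 2 && IsAsciiLetters(p)) return "ISO 3166-1";
            if (p.Length == 3 && IsAsciiDigits(p)) return "UN M.49";
        }
        return "none";
    }

    private static bool IsAsciiLetters(string s)
    {
        foreach (char c in s)
            if (!((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z'))) return false;
        return true;
    }

    private static bool IsAsciiDigits(string s)
    {
        foreach (char c in s)
            if (c < '0' || c > '9') return false;
        return true;
    }

    static void Main()
    {
        foreach (string tag in new string[] { "en-029", "en-US", "az-Cyrl-AZ", "fa" })
            Console.WriteLine("{0} -> {1}", tag, Classify(tag));
    }
}
```

    Script subtags (four letters) and variants (at least four characters) never match either test, so the first two-letter or three-digit subtag found before any singleton is the region.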

    As for the "no less than 6 Sami locales" -- we have nine Sami locales!

    Doug Ewell's other comment:

    'ko-KR' should not expand to "Korean (Korea)." The name of the country is Republic of Korea, or South Korea, or whatever. Just because North Korea is isolationist and the is no localized version of Windows for it, that doesn't mean it doesn't exist. This is particularly noticeable given the indulgent "Bolivarian Republic of Venezuela."

    Since we don't have North Korea or DPRK on the list, the usage of South Korea is not necessary -- it's the only Korea we ship to. :-)

    So it is our "short name"....

    I wouldn't consider the Venezuela name to be indulgent -- but someone asked us to update the English display name so we did. It's the kind of thing we could probably do for English (Philippines) at some point.

    The last comment there was from Paul B.:

    Great. Next, tell us the delta from Win7 to Win8 - what's new and what's changed? :-)

    Now that would have to be a whole new blog, some other day!

    I was disappointed no one asked about the difference with the display name and native name of Macedonia, but maybe people knew I already covered it....

  • Sorting it all Out

    There is no LOCALE_INATIVENAMECANBECAPITALIZEDIFYOUREALLYREALLYWANTTODOTHAT flag

    • 0 Comments

    You may remember prior blogs like Regarding the overthinking and underimplementing of names from last year, which talks a bit about the names we cart around, or Maybe they're just showing off their fancy fonts? ;-) from a few months ago, which openly pointed out one of the big problems of using native language names in user interfaces.

    The simple fact I pointed out:

    The odd one out here is the Native Names, since each one is most often provided by a different person, one per language - sometimes based on preferences or standards requirements or grammatical rules in each language!

    Anyway, someone was asking again about the different capitalizations in different languages that some other UI was showing.

    And someone else, in partial response, pointed out that someone else owned the data, but it may be per language choices. And it continued:

    They may also know, assuming that’s true, if there are scenarios in which it’s appropriate for a component for force capitalization.

    It is true.

    But we don't track when languages would be okay with capitalizing being done by someone else later.

    There is no LOCALE_INATIVENAMECANBECAPITALIZEDIFYOUREALLYREALLYWANTTODOTHAT flag

    I shudder to think of how we would manage all that, myself.

    Luckily, the person asking just went ahead and resolved the issue as by design, since it is. ;-)

  • Sorting it all Out

    The Locales of Windows 8, not yet divvied up...

    • 6 Comments

    Now in the past, I've written The Locales of Windows 7, all divvied up, which included:

    • Table 1: the locales representing languages into which Windows 7 localizes
    • Table 2: the locales representing languages for which Windows creates Language Interface Packs, aka LIPs
    • Table 3: locales whose identifiers are not directly associated with any localizations of Windows, even if a related identifier might make for one representing a suitable localization

    I've also written the sequel, The Locales of Windows 7, divvied up further, which included the slightly more niche:

    • Table 4: the locales into which Windows Server 2008 R2 is localized
    • Table 5: the locales into which PowerShell is localized, by Microsoft
    • Table 6: the locales into which Visual Studio is localized, by Microsoft

    Since then, I've also written The evolving Story of Locale Support, part 13: Divvying up locales, yet again!.

    But one list has never yet been published.

    And I'm gonna publish it now. :-)

    Table 8:  The locales of Windows 8 (full list)

    Name Display Name Native Name
    af-ZA Afrikaans (South Africa) Afrikaans (Suid-Afrika)
    am-ET Amharic (Ethiopia) አማርኛ (ኢትዮጵያ)
    ar-AE Arabic (U.A.E.) العربية (الإمارات العربية المتحدة)
    ar-BH Arabic (Bahrain) العربية (البحرين)
    ar-DZ Arabic (Algeria) العربية (الجزائر)
    ar-EG Arabic (Egypt) العربية (مصر)
    ar-IQ Arabic (Iraq) العربية (العراق)
    ar-JO Arabic (Jordan) العربية (الأردن)
    ar-KW Arabic (Kuwait) العربية (الكويت)
    ar-LB Arabic (Lebanon) العربية (لبنان)
    ar-LY Arabic (Libya) العربية (ليبيا)
    ar-MA Arabic (Morocco) العربية (المملكة المغربية)
    arn-CL Mapudungun (Chile) Mapudungun (Chile)
    ar-OM Arabic (Oman) العربية (عمان)
    ar-QA Arabic (Qatar) العربية (قطر)
    ar-SA Arabic (Saudi Arabia) العربية (المملكة العربية السعودية)
    ar-SY Arabic (Syria) العربية (سوريا)
    ar-TN Arabic (Tunisia) العربية (تونس)
    ar-YE Arabic (Yemen) العربية (اليمن)
    as-IN Assamese (India) অসমীয়া (ভাৰত)
    az-Cyrl-AZ Azeri (Cyrillic, Azerbaijan) Азәрбајҹан (Азәрбајҹан)
    az-Latn-AZ Azeri (Latin, Azerbaijan) Azərbaycan dili (Azərbaycan)
    ba-RU Bashkir (Russia) Башҡорт (Рәсәй)
    be-BY Belarusian (Belarus) Беларуская (Беларусь)
    bg-BG Bulgarian (Bulgaria) български (България)
    bn-BD Bengali (Bangladesh) বাংলা (বাংলাদেশ)
    bn-IN Bengali (India) বাংলা (ভারত)
    bo-CN Tibetan (PRC) བོད་ཡིག (ཀྲུང་ཧྭ་མི་དམངས་སྤྱི་མཐུན་རྒྱལ་ཁབ།)
    br-FR Breton (France) brezhoneg (Frañs)
    bs-Cyrl-BA Bosnian (Cyrillic, Bosnia and Herzegovina) босански (Босна и Херцеговина)
    bs-Latn-BA Bosnian (Latin, Bosnia and Herzegovina) bosanski (Bosna i Hercegovina)
    ca-ES Catalan (Catalan) Català (Català)
    ca-ES-valencia Valencian (Spain) Valencià (Espanya)
    chr-Cher-US Cherokee (Cherokee) ᏣᎳᎩ (ᏣᎳᎩ)
    co-FR Corsican (France) Corsu (Francia)
    cs-CZ Czech (Czech Republic) čeština (Česká republika)
    cy-GB Welsh (United Kingdom) Cymraeg (Y Deyrnas Unedig)
    da-DK Danish (Denmark) dansk (Danmark)
    de-AT German (Austria) Deutsch (Österreich)
    de-CH German (Switzerland) Deutsch (Schweiz)
    de-DE German (Germany) Deutsch (Deutschland)
    de-LI German (Liechtenstein) Deutsch (Liechtenstein)
    de-LU German (Luxembourg) Deutsch (Luxemburg)
    dsb-DE Lower Sorbian (Germany) dolnoserbšćina (Nimska)
    dv-MV Divehi (Maldives) ދިވެހިބަސް (ދިވެހި ރާއްޖެ)
    el-GR Greek (Greece) Ελληνικά (Ελλάδα)
    en-029 English (Caribbean) English (Caribbean)
    en-AU English (Australia) English (Australia)
    en-BZ English (Belize) English (Belize)
    en-CA English (Canada) English (Canada)
    en-GB English (United Kingdom) English (United Kingdom)
    en-IE English (Ireland) English (Ireland)
    en-IN English (India) English (India)
    en-JM English (Jamaica) English (Jamaica)
    en-MY English (Malaysia) English (Malaysia)
    en-NZ English (New Zealand) English (New Zealand)
    en-PH English (Republic of the Philippines) English (Philippines)
    en-SG English (Singapore) English (Singapore)
    en-TT English (Trinidad and Tobago) English (Trinidad and Tobago)
    en-US English (United States) English (United States)
    en-ZA English (South Africa) English (South Africa)
    en-ZW English (Zimbabwe) English (Zimbabwe)
    es-AR Spanish (Argentina) Español (Argentina)
    es-BO Spanish (Bolivia) Español (Bolivia)
    es-CL Spanish (Chile) Español (Chile)
    es-CO Spanish (Colombia) Español (Colombia)
    es-CR Spanish (Costa Rica) Español (Costa Rica)
    es-DO Spanish (Dominican Republic) Español (República Dominicana)
    es-EC Spanish (Ecuador) Español (Ecuador)
    es-ES Spanish (Spain) Español (España, alfabetización internacional)
    es-GT Spanish (Guatemala) Español (Guatemala)
    es-HN Spanish (Honduras) Español (Honduras)
    es-MX Spanish (Mexico) Español (México)
    es-NI Spanish (Nicaragua) Español (Nicaragua)
    es-PA Spanish (Panama) Español (Panamá)
    es-PE Spanish (Peru) Español (Perú)
    es-PR Spanish (Puerto Rico) Español (Puerto Rico)
    es-PY Spanish (Paraguay) Español (Paraguay)
    es-SV Spanish (El Salvador) Español (El Salvador)
    es-US Spanish (United States) Español (Estados Unidos)
    es-UY Spanish (Uruguay) Español (Uruguay)
    es-VE Spanish (Bolivarian Republic of Venezuela) Español (Republica Bolivariana de Venezuela)
    et-EE Estonian (Estonia) eesti (Eesti)
    eu-ES Basque (Basque) euskara (euskara)
    fa-IR Persian فارسى (ایران)
    ff-Latn-SN Fulah (Latin, Senegal) Fulah (Sénégal)
    fi-FI Finnish (Finland) suomi (Suomi)
    fil-PH Filipino (Philippines) Filipino (Pilipinas)
    fo-FO Faroese (Faroe Islands) føroyskt (Føroyar)
    fr-BE French (Belgium) français (Belgique)
    fr-CA French (Canada) français (Canada)
    fr-CH French (Switzerland) français (Suisse)
    fr-FR French (France) français (France)
    fr-LU French (Luxembourg) français (Luxembourg)
    fr-MC French (Monaco) français (Principauté de Monaco)
    fy-NL Frisian (Netherlands) Frysk (Nederlân)
    ga-IE Irish (Ireland) Gaeilge (Éire)
    gd-GB Scottish Gaelic (United Kingdom) Gàidhlig (An Rìoghachd Aonaichte)
    gl-ES Galician (Galician) galego (galego)
    gsw-FR Alsatian (France) Elsässisch (Frànkrisch)
    gu-IN Gujarati (India) ગુજરાતી (ભારત)
    ha-Latn-NG Hausa (Latin, Nigeria) Hausa (Nijeriya)
    haw-US Hawaiian (United States) Hawaiʻi (ʻAmelika)
    he-IL Hebrew (Israel) עברית (ישראל)
    hi-IN Hindi (India) हिंदी (भारत)
    hr-BA Croatian (Latin, Bosnia and Herzegovina) hrvatski (Bosna i Hercegovina)
    hr-HR Croatian (Croatia) hrvatski (Hrvatska)
    hsb-DE Upper Sorbian (Germany) hornjoserbšćina (Němska)
    hu-HU Hungarian (Hungary) magyar (Magyarország)
    hy-AM Armenian (Armenia) Հայերեն (Հայաստան)
    id-ID Indonesian (Indonesia) Bahasa Indonesia (Indonesia)
    ig-NG Igbo (Nigeria) Igbo (Nigeria)
    ii-CN Yi (PRC) ꆈꌠꁱꂷ (ꍏꉸꏓꂱꇭꉼꇩ)
    is-IS Icelandic (Iceland) íslenska (Ísland)
    it-CH Italian (Switzerland) italiano (Svizzera)
    it-IT Italian (Italy) italiano (Italia)
    iu-Cans-CA Inuktitut (Syllabics, Canada) ᐃᓄᒃᑎᑐᑦ (ᑲᓇᑕᒥ)
    iu-Latn-CA Inuktitut (Latin, Canada) Inuktitut (Kanatami)
    ja-JP Japanese (Japan) 日本語 (日本)
    ka-GE Georgian (Georgia) ქართული (საქართველო)
    kk-KZ Kazakh (Kazakhstan) Қазақ (Қазақстан)
    kl-GL Greenlandic (Greenland) kalaallisut (Kalaallit Nunaat)
    km-KH Khmer (Cambodia) ភាសាខ្មែរ (កម្ពុជា)
    kn-IN Kannada (India) ಕನ್ನಡ (ಭಾರತ)
    kok-IN Konkani (India) कोंकणी (भारत)
    ko-KR Korean (Korea) 한국어(대한민국)
    ku-Arab-IQ Central Kurdish (Iraq) کوردیی ناوەڕاست (کوردستان)
    ky-KG Kyrgyz (Kyrgyzstan) Кыргыз (Кыргызстан)
    lb-LU Luxembourgish (Luxembourg) Lëtzebuergesch (Lëtzebuerg)
    lo-LA Lao (Lao P.D.R.) ພາສາລາວ (ສປປ ລາວ)
    lt-LT Lithuanian (Lithuania) lietuvių (Lietuva)
    lv-LV Latvian (Latvia) latviešu (Latvija)
    mi-NZ Maori (New Zealand) Reo Māori (Aotearoa)
    mk-MK Macedonian (Former Yugoslav Republic of Macedonia) македонски јазик (Македонија)
    ml-IN Malayalam (India) മലയാളം (ഭാരതം)
    mn-MN Mongolian (Cyrillic, Mongolia) Монгол хэл (Монгол улс)
    mn-Mong-CN Mongolian (Traditional Mongolian, PRC) ᠮᠤᠨᠭᠭᠤᠯ ᠬᠡᠯᠡ (ᠪᠦᠭᠦᠳᠡ ᠨᠠᠢᠷᠠᠮᠳᠠᠬᠤ ᠳᠤᠮᠳᠠᠳᠤ ᠠᠷᠠᠳ ᠣᠯᠣᠰ)
    moh-CA Mohawk (Mohawk) Kanien'kéha
    mr-IN Marathi (India) मराठी (भारत)
    ms-BN Malay (Brunei Darussalam) Bahasa Melayu (Brunei Darussalam)
    ms-MY Malay (Malaysia) Bahasa Melayu (Malaysia)
    mt-MT Maltese (Malta) Malti (Malta)
    nb-NO Norwegian, Bokmål (Norway) norsk, bokmål (Norge)
    ne-NP Nepali (Nepal) नेपाली (नेपाल)
    nl-BE Dutch (Belgium) Nederlands (België)
    nl-NL Dutch (Netherlands) Nederlands (Nederland)
    nn-NO Norwegian, Nynorsk (Norway) norsk, nynorsk (Noreg)
    nso-ZA Sesotho sa Leboa (South Africa) Sesotho sa Leboa (Afrika Borwa)
    oc-FR Occitan (France) Occitan (França)
    or-IN Oriya (India) ଓଡ଼ିଆ (ଭାରତ)
    pa-Arab-PK Punjabi (Islamic Republic of Pakistan) پنجابی (پاکستان)
    pa-IN Punjabi (India) ਪੰਜਾਬੀ (ਭਾਰਤ)
    pl-PL Polish (Poland) polski (Polska)
    prs-AF Dari (Afghanistan) درى (افغانستان)
    ps-AF Pashto (Afghanistan) پښتو (افغانستان)
    pt-BR Portuguese (Brazil) Português (Brasil)
    pt-PT Portuguese (Portugal) português (Portugal)
    qut-GT K'iche (Guatemala) K'iche' (Guatemala)
    quz-BO Quechua (Bolivia) runasimi (Qullasuyu)
    quz-EC Quechua (Ecuador) runa shimi (Ecuador Suyu)
    quz-PE Quechua (Peru) runasimi (Peru)
    rm-CH Romansh (Switzerland) Rumantsch (Svizra)
    ro-RO Romanian (Romania) română (România)
    ru-RU Russian (Russia) русский (Россия)
    rw-RW Kinyarwanda (Rwanda) Kinyarwanda (Rwanda)
    sah-RU Sakha (Russia) Саха (Россия)
    sa-IN Sanskrit (India) संस्कृत (भारतम्)
    sd-Arab-PK Sindhi (Islamic Republic of Pakistan) سنڌي (پاکستان)
    se-FI Sami, Northern (Finland) davvisámegiella (Suopma)
    se-NO Sami, Northern (Norway) davvisámegiella (Norga)
    se-SE Sami, Northern (Sweden) davvisámegiella (Ruoŧŧa)
    si-LK Sinhala (Sri Lanka) සිංහල (ශ්‍රී ලංකා)
    sk-SK Slovak (Slovakia) slovenčina (Slovenská republika)
    sl-SI Slovenian (Slovenia) slovenski (Slovenija)
    sma-NO Sami, Southern (Norway) åarjelsaemiengïele (Nöörje)
    sma-SE Sami, Southern (Sweden) åarjelsaemiengïele (Sveerje)
    smj-NO Sami, Lule (Norway) julevusámegiella (Vuodna)
    smj-SE Sami, Lule (Sweden) julevusámegiella (Svierik)
    smn-FI Sami, Inari (Finland) sämikielâ (Suomâ)
    sms-FI Sami, Skolt (Finland) sää´mǩiõll (Lää´ddjânnam)
    sq-AL Albanian (Albania) Shqip (Shqipëria)
    sr-Cyrl-BA Serbian (Cyrillic, Bosnia and Herzegovina) српски (Босна и Херцеговина)
    sr-Cyrl-CS Serbian (Cyrillic, Serbia and Montenegro (Former)) српски (Србија и Црна Гора (Бивша))
    sr-Cyrl-ME Serbian (Cyrillic, Montenegro) српски (Црна Гора)
    sr-Cyrl-RS Serbian (Cyrillic, Serbia) српски (Србија)
    sr-Latn-BA Serbian (Latin, Bosnia and Herzegovina) srpski (Bosna i Hercegovina)
    sr-Latn-CS Serbian (Latin, Serbia and Montenegro (Former)) srpski (Srbija i Crna Gora (Bivša))
    sr-Latn-ME Serbian (Latin, Montenegro) srpski (Crna Gora)
    sr-Latn-RS Serbian (Latin, Serbia) srpski (Srbija)
    sv-FI Swedish (Finland) svenska (Finland)
    sv-SE Swedish (Sweden) svenska (Sverige)
    sw-KE Kiswahili (Kenya) Kiswahili (Kenya)
    syr-SY Syriac (Syria) ܣܘܪܝܝܐ (ܣܘܪܝܐ)
    ta-IN Tamil (India) தமிழ் (இந்தியா)
    ta-LK Tamil (Sri Lanka) தமிழ் (இலங்கை)
    te-IN Telugu (India) తెలుగు (భారత దేశం)
    tg-Cyrl-TJ Tajik (Cyrillic, Tajikistan) Тоҷикӣ (Тоҷикистон)
    th-TH Thai (Thailand) ไทย (ไทย)
    ti-ER Tigrinya (Eritrea) ትግርኛ (ኤርትራ)
    ti-ET Tigrinya (Ethiopia) ትግርኛ (ኢትዮጵያ)
    tk-TM Turkmen (Turkmenistan) Türkmen dili (Türkmenistan)
    tn-BW Setswana (Botswana) Setswana (Botswana)
    tn-ZA Setswana (South Africa) Setswana (Aforika Borwa)
    tr-TR Turkish (Turkey) Türkçe (Türkiye)
    tt-RU Tatar (Russia) Татар (Россия)
    tzm-Latn-DZ Tamazight (Latin, Algeria) Tamazight (Djazaïr)
    tzm-Tfng-MA Central Atlas Tamazight (Tifinagh, Morocco) ⵜⴰⵎⴰⵣⵉⵖⵜ (ⵍⵎⵖⵔⵉⴱ)
    ug-CN Uyghur (PRC) ئۇيغۇرچە (جۇڭخۇا خەلق جۇمھۇرىيىتى)
    uk-UA Ukrainian (Ukraine) українська (Україна)
    ur-PK Urdu (Islamic Republic of Pakistan) اُردو (پاکستان)
    uz-Cyrl-UZ Uzbek (Cyrillic, Uzbekistan) Ўзбекча (Ўзбекистон Республикаси)
    uz-Latn-UZ Uzbek (Latin, Uzbekistan) O'zbekcha (O'zbekiston Respublikasi)
    vi-VN Vietnamese (Vietnam) Tiếng Việt (Việt Nam)
    wo-SN Wolof (Senegal) Wolof (Senegaal)
    xh-ZA isiXhosa (South Africa) isiXhosa (uMzantsi Afrika)
    yo-NG Yoruba (Nigeria) Yoruba (Nigeria)
    zh-CN Chinese (Simplified, PRC) 中文(中华人民共和国)
    zh-HK Chinese (Traditional, Hong Kong S.A.R.) 中文(香港特別行政區)
    zh-MO Chinese (Traditional, Macao S.A.R.) 中文(澳門特別行政區)
    zh-SG Chinese (Simplified, Singapore) 中文(新加坡)
    zh-TW Chinese (Traditional, Taiwan) 中文(台灣)
    zu-ZA isiZulu (South Africa) isiZulu (iNingizimu Afrika)

     I'll divvy them up another time.

    For now I'll love the list!
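    If you want to produce a list like this on your own machine, a minimal sketch in managed code follows. It assumes that CultureInfo.EnglishName is the closest match to the Display Name column above; the rows you get depend on the Windows and .NET versions in use, so do not expect an exact match with Table 8. The ListLocales class is made up for illustration.

```csharp
using System;
using System.Globalization;

class ListLocales
{
    // One row in the same column order as the table: Name, Display Name, Native Name.
    public static string FormatRow(CultureInfo ci)
    {
        return string.Format("{0}\t{1}\t{2}", ci.Name, ci.EnglishName, ci.NativeName);
    }

    static void Main()
    {
        // Specific cultures are the ones that carry a country/region, like the rows above.
        CultureInfo[] cultures = CultureInfo.GetCultures(CultureTypes.SpecificCultures);

        // Sort by tag, ignoring case (not necessarily the same order as the table).
        Array.Sort(cultures, delegate(CultureInfo a, CultureInfo b)
        {
            return StringComparer.OrdinalIgnoreCase.Compare(a.Name, b.Name);
        });

        foreach (CultureInfo ci in cultures)
            Console.WriteLine(FormatRow(ci));
    }
}
```

    Many native names will not survive the default console code page, so redirecting the output to a file or setting Console.OutputEncoding to UTF-8 first is probably a good idea.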

  • Sorting it all Out

    The evolving Story of Locale Support, part 26: Hey Windows 8, there's someone on the phone for you.

    • 2 Comments

    Back at the end of June, Todd Brix wrote a blog on the Windows Phone Developer Blog entitled First look at the Windows Phone 8 Marketplace.

    In it, there was a huge list of regions, covering large parts of the world.

    180 places being covered by the AppHub and the Marketplace.

    This is huge, as is the already announced news that Windows Phone 8 would be heavily using the Windows 8 core.

    A core that includes the Windows 8 locale data!

    The Windows Phone 8 team has on it many people who have been holding our feet to the fire as they had the combination of the joyful experience of more coverage and updated data with the painful experience of changes and even a few bugs.

    But it was really amazing to work with them and partner with them here -- we are a better product because of the work they were doing as a part of their product....

    One of the most frequent sources of weirdness was inconsistencies between data fields that would reasonably be considered related -- like LOCALE_SSHORTTIME and LOCALE_STIMEFORMAT.

    In the end, when they were different it was just due to historical reasons, and the lack of anyone reviewing them for differences to date.

    That kind of cleanup has started largely due to their diligence and reports, and further cleanup in the future will definitely take their issues into account.
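    For anyone wanting to run a similar review pass from managed code, here is a minimal sketch. It assumes that .NET's ShortTimePattern and LongTimePattern surface the same data as LOCALE_SSHORTTIME and LOCALE_STIMEFORMAT (which can depend on the .NET and Windows versions), and it picks just one hypothetical definition of "inconsistent": the two patterns disagreeing on 12-hour versus 24-hour time. The TimeFormatReview class is made up for illustration and is not how the actual cleanup was done.

```csharp
using System;
using System.Globalization;

class TimeFormatReview
{
    // Returns 'H' (24-hour), 'h' (12-hour), or '\0' when the pattern has no hour field.
    // Text inside single quotes is literal and is skipped; backslash escapes are not handled.
    public static char HourStyle(string pattern)
    {
        bool inLiteral = false;
        foreach (char c in pattern)
        {
            if (c == '\'') { inLiteral = !inLiteral; continue; }
            if (inLiteral) continue;
            if (c == 'H' || c == 'h') return c;
        }
        return '\0';
    }

    // True when both patterns use the same kind of hour field (or both have none).
    public static bool SameClock(string shortPattern, string longPattern)
    {
        return HourStyle(shortPattern) == HourStyle(longPattern);
    }

    static void Main()
    {
        foreach (CultureInfo ci in CultureInfo.GetCultures(CultureTypes.SpecificCultures))
        {
            DateTimeFormatInfo dtf = ci.DateTimeFormat;
            if (!SameClock(dtf.ShortTimePattern, dtf.LongTimePattern))
            {
                Console.WriteLine("{0}: short \"{1}\" vs long \"{2}\"",
                    ci.Name, dtf.ShortTimePattern, dtf.LongTimePattern);
            }
        }
    }
}
```

    Anything it prints is only a candidate for a human to look at, since a difference may well be intentional for that locale.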

    Plus it is really cool to see these things on Windows Phone 8!

  • Sorting it all Out

    No do-overs or mulligans in International!

    • 0 Comments

    It is well known that intl.cpl is a relentless beast that does what it is asked to with cunning and ruthless efficiency.

    If you know how to ask for it, at least. :-)

    The question from the other day was simple enough:

    After installing language pack on a English OS, the welcome screen language can't be changed back to English if it was changed to another language by running following command in system context.
    control intl.cpl,, /f:"settings.xml"
     
    setting.xml contains:

    <gs:GlobalizationServices xmlns:gs="urn:longhornGlobalizationUnattend">
      <gs:UserList>
        <gs:User UserID="Current" CopySettingsToSystemAcct="true" />
      </gs:UserList>
      <gs:MUILanguagePreferences>
        <gs:MUILanguage Value="da-dk" />
        <gs:MUIFallback Value="da-DK" />
      </gs:MUILanguagePreferences>
    </gs:GlobalizationServices>

    I don't know about you, but I like it when the answer is right there in the question!

    Did you see it?

    It is in the <gs:User UserID="Current" CopySettingsToSystemAcct="true" /> line.

    Because by setting the user account and requesting that the setting be copied to system accounts, one has formally requested that the copy take place.

    The user context/system context difference is simply whether the user running the script has permission to make the change!

    Just like in this UI in Windows 7/Server 2008 R2/Windows 8/Server 2012 intl.cpl:

    Copy to System Settings v1

    or this one in the Vista/Server 2008 intl.cpl:

    Copy to System Settings v2

    The follow up question was:

    Is there a way to revert this change on  machines where this XML file was already used and the machines are in problem state?

    Unfortunately, no.

    It doesn't store any previous settings.

    And there is no mulligan....

    So, to fix, one would have to:

    1. Create and run a new XML file that sets everything back to English and does include the CopySettingsToSystemAcct="true" piece, and then
    2. Run a version of the original XML file that makes the desired per-user changes and does not include the CopySettingsToSystemAcct="true" piece.
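
    A rough sketch of what those two files might contain, modeled on the XML from the question. The file names and the choice of en-US as the English language are assumptions, so adjust them to whatever is actually installed.

```xml
<!-- Step 1, hypothetical reset-to-english.xml: set English AND copy it to the system accounts -->
<gs:GlobalizationServices xmlns:gs="urn:longhornGlobalizationUnattend">
  <gs:UserList>
    <gs:User UserID="Current" CopySettingsToSystemAcct="true" />
  </gs:UserList>
  <gs:MUILanguagePreferences>
    <gs:MUILanguage Value="en-US" />
  </gs:MUILanguagePreferences>
</gs:GlobalizationServices>
```

```xml
<!-- Step 2, hypothetical user-danish.xml: the original per-user change, with no copy to the system accounts -->
<gs:GlobalizationServices xmlns:gs="urn:longhornGlobalizationUnattend">
  <gs:UserList>
    <gs:User UserID="Current" />
  </gs:UserList>
  <gs:MUILanguagePreferences>
    <gs:MUILanguage Value="da-dk" />
    <gs:MUIFallback Value="da-DK" />
  </gs:MUILanguagePreferences>
</gs:GlobalizationServices>
```

    Each file would be run the same way as in the question, for example control intl.cpl,, /f:"reset-to-english.xml" followed by control intl.cpl,, /f:"user-danish.xml".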

    When this unattend feature was being created in Longhorn (which became Vista), this very scenario was discussed.

    But the fact that the UI didn't support it made it less interesting, so the idea was dropped, especially since just running two files could get the desired effect.

    Just like steps 1 and 2, above!

  • Sorting it all Out

    In the land of the unsupported, previous blogs and tools can be king

    • 0 Comments

    The note from the Suggestion Box was:

    Loading keyboard dlls in a 64-bit environment using a 32-bit application

    Hi Michael!

    I've been enjoying you topics regarding the keyboard dlls (kbd**.dll) files and how to mess with them :)

    Earlier I decided to create a onscreenkeyboard of my own, based on scan codes, so I could take the advantage of these existing keyboards dlls.

    In my path of development; I did not get it to work on a 64-bit system, the pVkToWcharTable was always NULL. It only worked if I compiled a 64-bit version of my app.

    I've written a class to actually manage 32-bit app on a 64-bit system, and it works. I've even shared it to the public, by writing an article: http://lars.werner.no/?p=870

    Since you always investigate deep, do you have any root cause of what actually fails?

    Hope to see an article on that later on!

    Cheers,

    Lars Werner

    http://lars.werner.no

    We don't actually document how to act like Windows does when it uses one of these keyboard layout DLLs.

    Previous travails, discussed in The wacky world of WOW64 keyboards, un-leashed, un-locked, un-something-or-other and If you just don't think you can hold it (64-bit style!), may seem relevant.

    But they aren't -- not directly at least.

    Yet by looking at what happens differently in 64-bit builds, one can reverse engineer the differences for 64-bit builds either directly here or indirectly by looking at kbd.h and how it defines the structure differently for the 64-bit case.

    I'd help, but that would be a Microsoft employee doing the work to document the stuff that we decided not to document....

  • Sorting it all Out

    It may be easier to get your GEO on, instead...

    • 0 Comments

    The latest question that passed through the gauntlet was:

    I’m working with .NET and I need to be able to map a set of two-letter country codes (e.g. US, IE) into the appropriate RegionInfo object. I’ve got this working correctly for all regions defined in .NET, but we’ve recently discovered that there are a number of regions which are not present in .NET including Andorra and Antigua and Barbuda.
     
    I realise we can overcome this problem by creating custom cultures, but this seems like it might be a maintainability issue. 
     
    In case it’s relevant, my team is using .NET 3.5 due to various limitations for now.

    What a nightmare to create that many custom cultures, none of which would likely have useful data...

    All without getting reviewed the way we review data now! :-(

    I think the wider list can be picked up with a simple p/invoke call to EnumSystemGeoID instead.

    GEO stuff has limitations (as I mentioned in the past), especially around time zones and such, but for names it should work pretty well!

    Say code like the following:

    using System;
    using System.Text;
    using System.Runtime.InteropServices;

    class Program {
     private static string stTarget;

     static void Main(string[] args) {
      if(args.Length > 0) {
       stTarget = args[0];
      }
      else {
       stTarget = string.Empty;
      }
      EnumSystemGeoID(GEOCLASS_NATION, 0, GeoInfoProc);
     }

     private static bool GeoInfoProc(Int32 geoID) {
      int len = GetGeoInfo(geoID, GEO_FRIENDLYNAME, null, 0, 0);
      if(len > 0) {
       StringBuilder data = new StringBuilder(len);
       len = GetGeoInfo(geoID, GEO_FRIENDLYNAME, data, len, 0);
       if(len > 0) {
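        // Despite its name, stIsoCode holds the friendly name, because GEO_FRIENDLYNAME
        // is what is requested above; request GEO_ISO2 instead to get the two-letter code.
        string stIsoCode = data.ToString(0, len - 1);
        bool fFound = string.Compare(stIsoCode, stTarget, StringComparison.OrdinalIgnoreCase) == 0;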
        if(stTarget.Length == 0 || fFound) {
         Console.WriteLine(string.Format("{0} = {1}", stIsoCode, geoID));
        }
        return !fFound;
       }
      }
      return false;
     }

     // Geo Type
     private const uint GEO_NATION = 0x0001;
     private const uint GEO_LATITUDE = 0x0002;
     private const uint GEO_LONGITUDE = 0x0003;
     private const uint GEO_ISO2 = 0x0004;
     private const uint GEO_ISO3 = 0x0005;
     private const uint GEO_RFC1766 = 0x0006;
     private const uint GEO_LCID = 0x0007;
     private const uint GEO_FRIENDLYNAME = 0x0008;
     private const uint GEO_OFFICIALNAME = 0x0009;
     private const uint GEO_TIMEZONES = 0x000A;
     private const uint GEO_OFFICIALLANGUAGES = 0x000B;

     // Geo Class
     private const uint GEOCLASS_NATION = 16;
     private const uint GEOCLASS_REGION = 14;

     private delegate bool EnumGeoInfoProc(int geoID);

     [DllImport("kernel32.dll")]
     private static extern bool EnumSystemGeoID(uint geoClass, int parentGeoID, EnumGeoInfoProc enumGeoInfoProc);

     [DllImport("kernel32.dll", CharSet = CharSet.Unicode, EntryPoint ="GetGeoInfoW")]
     private static extern int GetGeoInfo(int geoID, uint geoType, StringBuilder geoData, int size, int langID);

    }

    It will build a nice list that happens to include all of the various names requested....

    This code or code like it should work just fine with 3.5 or 4.0 or even 4.5.

    So screw RegionInfo! You can get your GEO on, instead....

  • Sorting it all Out

    Getting your spidey senses FIXED

    • 0 Comments

    The other day, Peter Parker (probably not his real name) asked:

    How can I determine if a font is fixed width, like Visual Studio does?

    Well, he certainly can't rely on his spidey sense even if he was the real Peter Parker, since Spiderman doesn't scale! :-)

    When he referred to "like Visual Studio does", he meant the Bold items on the list:

    Fixed width fonts are BOLD as love

     The principle is simple enough:

    • if the font identifies itself as fixed in its own data as you get by asking for a FIXED_PITCH font in the fdwPitchAndFamily member, or
    • it follows the "fixed enough" principles discussed here for East Asian fonts.

    You can see many examples of this being correctly done for Meiryo/Meiryo UI (since the Latins are based on Verdana) and Microsoft YaHei/Microsoft JhengHei (since the Latins are based on Segoe UI) and so on.

    It may even interest our Marvel action hero enough to want to emulate the checks.

    However, there are three problems here:

    • while Visual Studio correctly flags MingLiU, it misses the three MingLiU-esque fonts that are just as fixed, as well as others, and
    • almost every East Asian font is fixed width for the East Asian characters, at least, and
    • some other scripts show the same characteristics of "fixed width within the script".

    I suppose there could be an IsFixedWidth method where you can pass the font name and some text.

    It could then return whether all of that text you passed in is the exact same width per character!
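
    No such method exists today, but the idea can be sketched. In the minimal version below, IsFixedWidth is hypothetical, and rather than a font name it takes a measuring callback, so the sketch does not depend on any particular drawing API. The widths used in Main are invented for the demo and do not come from a real font.

```csharp
using System;
using System.Globalization;

static class FixedWidthCheck
{
    // Returns true when every text element in 'text' measures the same width.
    // 'measure' is supplied by the caller and returns the width of one text element.
    public static bool IsFixedWidth(string text, Func<string, double> measure)
    {
        double? first = null;
        // Walk text elements so surrogate pairs and combining sequences stay together.
        TextElementEnumerator e = StringInfo.GetTextElementEnumerator(text);
        while (e.MoveNext())
        {
            double width = measure(e.GetTextElement());
            if (first == null) first = width;
            else if (Math.Abs(width - first.Value) > 0.001) return false;
        }
        return true; // empty text is trivially fixed width
    }

    static void Main()
    {
        // Invented widths: ASCII is 1 unit wide, everything else is 2 units wide,
        // loosely imitating an East Asian font with half-width Latin glyphs.
        Func<string, double> fake = delegate(string s) { return s[0] < 0x80 ? 1.0 : 2.0; };

        Console.WriteLine(IsFixedWidth("abc", fake));    // True
        Console.WriteLine(IsFixedWidth("日本語", fake));  // True
        Console.WriteLine(IsFixedWidth("a日", fake));     // False
    }
}
```

    The demo shows the "fixed width within the script" situation: the Latin text and the Japanese text each pass on their own, while the mixed string fails. A real callback would measure each element with the font in question.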

    For now, you are mildly on your own checking for this.

    Even if you spin a web any size and catch thieves just like flies...

  • Sorting it all Out

    Unraveling the job creation issue

    • 11 Comments

    For the record, this is not a political Blog.

    Though I suppose this one posted blog is.

    There is no way I want to make this a common thing.

    And it goes without saying that Microsoft is not involved....

    I just have a question.

    Now politics in the US are nice and hyper-polarized, as pictures like this point out:

    GOP Translator

    Whatever.

    This is not really about the diagram.

    But a lot of the rhetoric pits liberals who support the Buffett Rule against conservatives who argue it would affect job creation and, some say, investment as well.

    Okay, let's momentarily stipulate that both points of view have merit.

    Maybe they do, at some level, after all.

    Now both job creation and investment in the economy are understandable and potentially measurable things.

    So why not just measure them?

    Why not hinge the tax rate of capital gains on whether or not jobs are created or the investment in question measurably helps the economy in some way?

    How come I have never seen such a plan suggested?

    I mean, I doubt anyone would try to publicly defend the person or company that lays off 500 people and hurts the environment willingly as deserving better tax rates than the one that hires 500 and helps build a highway.

    And if you accept that, then all you need to do is figure out what rates to use for what.

    So let's tax them according to what they accomplish with the money, and pass a bill!

    Simple? Of course not. But we are paying our lawmakers to solve tough problems.

    Like this one....

  • Sorting it all Out

    There's no "I" in IDN, part 14: It turns out there's no "I" in IE, either

    • 4 Comments

    Previous blogs in this series:

    I knew at some point I'd have to deconstruct the support of International Domain Names in Internet Explorer.

    In other words, IDN in IE. :-)

    Unfortunately, it is pretty complicated.

    So I was looking forward to it like I'd look forward to a root canal, you know?

    Thankfully, Eric Lawrence saved me the trouble!

    In EricLaw's IEInternals, he wrote Brain Dump: International Text last month, in which he explained the IDN settings in this picture:

    He also covered all of these "International" settings from the dialog.

    In particular:

    Send IDN server names is enabled by default and will force IE to encode hostnames in URLs following the rules of RFC3491 and RFC3492. The user will be shown the URL in the address bar in Unicode form if and only if the URL is deemed non-spoofable. Please see this IEBlog post on the rules of IDN Non-spoofability

    Send IDN server names for Intranet addresses is disabled by default for compatibility with legacy Windows networks that were using UTF-8 to support non-ASCII hostnames. Other browsers, to the best of my knowledge, do not have special handling for Intranet sites, and I believe that current versions of Active Directory and the Windows DNS server support punycoded hostname registration and lookup.
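
    To see what that encoding looks like in practice, here is a small sketch using Python's built-in "idna" codec, which applies the older IDNA 2003 rules built on those RFCs. It says nothing about how IE does the work internally; the two helper names and the sample hostname are just for illustration.

```python
def to_ascii_hostname(hostname: str) -> str:
    """Encode a Unicode hostname to its ASCII (punycoded) form."""
    return hostname.encode("idna").decode("ascii")


def to_unicode_hostname(hostname: str) -> str:
    """Decode an ASCII (punycoded) hostname back to Unicode."""
    return hostname.encode("ascii").decode("idna")


print(to_ascii_hostname("bücher.example"))            # xn--bcher-kva.example
print(to_unicode_hostname("xn--bcher-kva.example"))   # bücher.example
print(to_ascii_hostname("example.com"))               # plain ASCII is unchanged
```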

     Since he did all this work, it saves me the trouble!

    Kinda great teamwork, he and I.

    I guess there's no "I" in IE, either! :-)

  • Sorting it all Out

    ResolveLocaleName cleans up and flies right

    • 1 Comments

    Regular readers may remember my Four cases where I don't like ResolveLocaleName (and you shouldn't either!) from a few months ago.

    I described a terrible situation with ResolveLocaleName.

    This function, added in Windows 7, would do all of the following:

    In                           Out
    en-Latn-AU                   en-US
    zh-Hant-TW                   zh-HK
    en-Cyrl-TT                   en-US
    en-CanYouBelieve-THISCRAP    en-US

    and I pointed out how this function works. It works badly.

    Well, one person took this to heart, my friend Brendan.

    He entered a bug on this for Windows 8 (I was actually just looking at the Win7 repro but it was I guess in Win8 too!).

    Anyhow, despite how late it was they did do a quick fix!

    So, once you have RTM bits, the results are much more reasonable:

    In                           Out
    en-Latn-AU                   en-AU
    zh-Hant-TW                   zh-TW
    en-Cyrl-TT                   en-TT
    en-CanYouBelieve-THISCRAP    en-US

    Wow -- much better!

    Now, we just need to port it to Windows 7... :-)
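
    For the curious, here is one way to get the more reasonable results from the second table. To be clear, this is not how ResolveLocaleName is implemented; it is a sketch of the general idea (keep the language and region, ignore the rest, fall back to a per-language default). The SUPPORTED set, the LANGUAGE_DEFAULT table and the resolve function are all hypothetical stand-ins.

```python
import re
from typing import Optional

# Hypothetical stand-ins; a real system knows many more locales.
SUPPORTED = {"en-US", "en-AU", "en-TT", "zh-TW", "zh-HK"}
LANGUAGE_DEFAULT = {"en": "en-US"}

# A region subtag is two letters or three digits.
REGION = re.compile(r"^(?:[A-Za-z]{2}|[0-9]{3})$")


def resolve(name: str) -> Optional[str]:
    """Map an arbitrary locale name to a supported specific locale."""
    parts = name.split("-")
    language = parts[0].lower()
    # Look for a region after the language, skipping script and junk subtags.
    for part in parts[1:]:
        if REGION.match(part):
            candidate = f"{language}-{part.upper()}"
            if candidate in SUPPORTED:
                return candidate
    # No usable region: fall back to the language's default, if any.
    return LANGUAGE_DEFAULT.get(language)


for name in ["en-Latn-AU", "zh-Hant-TW", "en-Cyrl-TT",
             "en-CanYouBelieve-THISCRAP"]:
    print(name, "->", resolve(name))
```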

  • Sorting it all Out

    Facebook thinks my self-esteem is for sale. WTF?

    • 4 Comments

    So, the other day, I shared a picture on Facebook.

    Nothing I haven't done before.

    As usual, I was just sharing someone else's picture (I seldom create pictures myself).

    Just passing the time.

    Sometimes people ignore them.

    Other times people "Like" or even "Share" them.

    Every once in a while one might inspire a kiss from you-know-who.

    Anyway, then something got my attention.

    Looking at the posted picture, I was given a new option:

    Shameless [paid] self-promotion

    What the hell?!?

    For the record, and to all my friends, I will never pay to announce any post. If you see it and like it then GREAT.

    But my value system in its current form requires the approval of just one person. And she isn't for sale, with or without Facebook pimp proxy.

    I hope Zuckerberg has less creepy ways for Facebook to make money post-IPO than the "buy more approval from your friends" technique.

    Probably good that friends won't be told someone announced posts this way, I can't think of a better metric for unfriending....

     

     

  • Sorting it all Out

    It's 2012! NT 3.1 just called, they want their [circa 1993] methodology back!

    • 5 Comments

    Some questions confuse me even when I know the language in which they are asked fluently.

    In my experience, the most common cause is when the person asking the question is trying to connect two things together that lack a direct, useful connection.

    In such cases, understanding what the true question is involves kicking the living crap out of, er, disproving the "wrong" question, and helping the right one emerge.

    Take for example the following question sent the other day:

    Is there a table that shows what the delta is between [the list of supported input languages] and [the list of system locales]?  Hindi is one of the examples, and I am interested in a comprehensive list.

    Hmmm.

    Okay, we'll start by defining the two terms:

    [the list of supported input languages] - Mostly you can go to HKLM\SYSTEM\CurrentControlSet\Control\Keyboard Layouts\ and enumerate the subkeys under it. This will give you a list of the keyboards.

    IMEs are not included, but all of the existing IMEs fall in two categories:

    • TextTableService TIPs for Yi, Amharic, and Tigrinya -- all three of which are Unicode only, and
    • TextTableService TIPs for Traditional Chinese and IMEs for Traditional Chinese, Simplified Chinese, Korean, and Japanese -- every one of which supports between several thousand and many tens of thousands of characters not on their nominal default system code page.

    Thus, all of them are either Unicode-only or Unicode-only enough for current purposes that they may as well all be treated that way.

    For the keyboards under that registry key, every subkey contains a KLID; if converted to a hexadecimal number, then LOWORD(<KLID>) is usually but not always a LANGID.

    You can pass those LANGID values to GetLocaleInfo(LOWORD(<klid>), LOCALE_IDEFAULTANSICODEPAGE, ..) and if you get back a 1 then you have a Unicode-only locale.

    Note that if LOWORD(<klid>) is 0x0c00, then it's either one of those locale-less keyboards we added in Windows 8 (most but not all of which are Unicode-only) or a custom keyboard created by MSKLC based on a custom locale (which does not have a known code page).
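
    The LOWORD step is just arithmetic, so here is a small sketch of it in Python. It works on KLID strings you have already read from the registry and does not touch the registry itself. The names klid_to_langid, describe and NO_FIXED_LANGID are made up for this example, and "a0000c00" is an invented KLID used only to show the 0x0c00 case.

```python
NO_FIXED_LANGID = 0x0C00  # hypothetical name for the 0x0c00 case


def klid_to_langid(klid: str) -> int:
    """Return LOWORD of a KLID given as an 8-digit hexadecimal string."""
    return int(klid, 16) & 0xFFFF


def describe(klid: str) -> str:
    langid = klid_to_langid(klid)
    if langid == NO_FIXED_LANGID:
        return f"{klid}: 0x0c00, locale-less or custom keyboard"
    # Usually, but not always, a LANGID.
    return f"{klid}: candidate LANGID 0x{langid:04x}"


for klid in ["00000409", "00010409", "00000439", "a0000c00"]:
    print(describe(klid))
```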

    Of the rest of the keyboards that do have ACP values, many of them contain characters outside of the corresponding default system code page.

    Now while a LOCALE_IHASLETERSOUTSIDETHECP would potentially be quite useful, we don't have that. And it may not match the keyboard anyway.

    Not to mention that there is no intrinsic queryable property or attribute of a KLID that can be used to easily identify what a keyboard supports.

    SUMMARY: for almost every keyboard, you cannot find out whether the keyboard corresponds to a code page.

    Any code page.

    Popping the stack from this disaster for a moment, let's go back to the original question.

    Now [the list of system locales] is a bit less messy to get.

    Just EnumSystemLocalesEx will let you get that list, and if needed you can use GetLocaleInfoEx(<enumerated name>, LOCALE_IDEFAULTANSICODEPAGE, ...), where any result other than 1 means it's a valid potential system locale.

    Easy, and even supports custom locales!

    Of course they all have some overlap.

    In the end, they only needed the second part anyway for what they had in mind - the real question was about working with characters off the default system codepage; keyboards were the easiest way to repro the problem and thus the repro overcomplicated things.

    But now that you wade through either list, and even try to match them up where they overlap, a reasonable person can come to just one conclusion.

    It isn't 1993 anymore.

    If you depend on code pages then...

    • ...you fail for a bunch of the locales;
    • ...you fail for a bunch of the keyboards;
    • ...[if you don't use Unicode] you fail for a bunch of the customers....
  • Sorting it all Out

    Ya gotta have options, man!

    • 0 Comments

    Sean sent me the following contact request, which I decided to make into a blog:

    MSKLC allows entry of unicode characters but complains if they aren't in the locale codepage.  But there doesn't seem to be a way to set the codepage -- ugh.  How do i tell - the keyboard/windows/whatever - to just work in Unicode and forget all this 1258 nonsense???  heh... sorry for my 4am desperate ramblings...

    The message did indeed arrive at 4:26am. ;-)

    The answer can be found under View|Options... and the Options dialog it launches:

    The MSKLC Options dialog

    It's that CheckBox over near the bottom right, the one that says Include Validation Warnings Related to Codepages.

    The help file topic explains how it works:

    • Include Validation Warnings Related to Code Pages - When running through the Validation process, there can be warnings related to code points currently on the keyboard layout that are not representable on the default system code page of the associated language. This indicates there may be a compatibility issue for non-Unicode applications.
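
    As a rough illustration of that kind of check (not MSKLC's actual code), here is a sketch in Python. The function name, the include_warnings flag standing in for the CheckBox, and the use of None for a language with no code page are all assumptions made for the example. It relies on Python's own codec tables, which may not match Windows conversion behavior in every detail.

```python
from typing import List, Optional


def codepage_warnings(chars: str, codepage: Optional[str],
                      include_warnings: bool = True) -> List[str]:
    """List the characters in chars that cannot be encoded on codepage."""
    # Nothing to report if warnings are switched off or there is no code page.
    if not include_warnings or codepage is None:
        return []
    warnings = []
    for ch in chars:
        try:
            ch.encode(codepage)
        except UnicodeEncodeError:
            warnings.append(f"U+{ord(ch):04X} is not on code page {codepage}")
    return warnings


print(codepage_warnings("aé", "cp1252"))   # both are on the code page
print(codepage_warnings("aक", "cp1252"))   # DEVANAGARI LETTER KA is not
print(codepage_warnings("aक", "cp1252", include_warnings=False))
```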

    I'll tell you a secret.

    Every single option on that dialog came from feedback from either the alpha pre-release or the beta pre-release, when someone gave me a keyboard layout with weird or strange behavior.

    Like Sean's report, but earlier. :-)

    At first I always warned, but then someone built a Hindi (Unicode only) keyboard.

    So rather than a page of warnings covering nearly every key, I made it silently skip that validation for Unicode only locales.

    Then someone else tried to build an Urdu keyboard -- code page 1256 but with many letters not on the codepage -- and she wanted a way to turn off those warnings entirely since those particular ones weren't useful in her case, even though they could be important in other situations.

    That's when I added that CheckBox.

    Later came someone building 15 different Arabic script keyboards who was sick of having to uncheck that CheckBox over and over.

    Suddenly, the Remember Settings After Shutdown setting was born.

    And so on....

    Anyway, that should help Sean out.

    Another time I'll tell you the exact reason why I was so sensitive about tons of spurious warnings....

     

     
