As Windows XP nears the end of its extended support (see lifecycle fact sheet), I thought I'd mention Locale Names again.  I've been asking people for years to use locale names, and now we're a step closer to deprecating LCIDs.

In the modern and managed worlds we've been using Locale Names/Culture Names/Language Names for some time, but in native Windows code, those pesky LCIDs have been hanging on.  A large part of that is probably because locale names and newer concepts (like the Windows.Globalization APIs) were introduced in Windows Vista and later.  Applications that wanted to run on XP might still see LCIDs, but everyone else could have been using names.

However, now there's an opportunity to reconsider what has probably been the default choice for many developers, and to use the modern Windows.Globalization APIs, or at least start using the Locale Name with older APIs like GetLocaleInfoEx.

Locale names are far more robust than the older LCIDs, allowing many more combinations.  In Windows 8.1, users can even select their own combination in the language profile, even if it isn't a built-in locale.  They could pick "fj-FJ" or other language names.  (I personally play with tlh-Qaak sometimes.)  There are over 6000 valid language codes, and hundreds of region ids (both ISO and UN M.49), allowing millions of possibilities for the entire world.
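
To make the "combination" idea concrete, here is a rough sketch of how a locale name breaks down into language, script and region subtags.  The helper name split_locale_name is hypothetical, and the rules are a deliberate simplification of the real language tag grammar (it ignores variants, extensions and private-use subtags), so treat it as an illustration rather than a validator.

```python
from typing import NamedTuple, Optional


class LocaleParts(NamedTuple):
    language: str
    script: Optional[str]
    region: Optional[str]


def split_locale_name(name: str) -> LocaleParts:
    """Split a name such as "fj-FJ" into its subtags (simplified, hypothetical helper)."""
    subtags = name.split("-")
    language = subtags[0].lower()
    script = None
    region = None
    for tag in subtags[1:]:
        if script is None and region is None and len(tag) == 4 and tag.isalpha():
            # Four letters directly after the language: a script code (e.g. Latn, Qaak).
            script = tag.title()
        elif region is None and (
            (len(tag) == 2 and tag.isalpha()) or (len(tag) == 3 and tag.isdigit())
        ):
            # Two letters (ISO 3166) or three digits (UN M.49): a region.
            region = tag.upper()
        else:
            # Anything else is outside the scope of this sketch.
            break
    return LocaleParts(language, script, region)


print(split_locale_name("fj-FJ"))
print(split_locale_name("tlh-Qaak"))
print(split_locale_name("es-419"))
```

Because each subtag is independent, any language can pair with any region, which is where the very large number of possible names comes from.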

Compare that to LCIDs/Locale Identifiers/Language Identifiers.  The original LCID idea worked when it was developed, but it's pretty much limited to fewer than 512 "primary" languages, and 32 variations.  For languages spoken in many regions, like English, there aren't enough sublangs, yet adding an additional primary language might break applications' assumptions about which languages were English.  The macros to parse an LCID start breaking down.  To conserve the limited space, some related but different languages have been stuck in the same langid.  In other cases, thinking about the structure of LCIDs has changed since their original inception.  "Neutrals" also collide with the sublangs in some cases, making it unclear if an LCID is specific or neutral.

  • LANGIDs might actually represent more than one language; it's better to ask GetLocaleInfo for the ISO Language name.
  • SUBLANGs depend on the LANGID.  The "US" variations of en and es, for example, have different sublangs.  It's far easier to ask GetLocaleInfo for the ISO region code and test that.
  • Multiple LANGIDs will likely actually represent the same language in the future.
  • Newer locales in Windows 8.1 don't have assigned LCIDs (they can all return the same LCID, or transient variations if they're an explicit user preference). 
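
The first two points can be seen with a little bit arithmetic.  The sketch below re-implements the PRIMARYLANGID and SUBLANGID macros in Python (the function names are mine; the masks and the LCID values are the ones I believe winnt.h and the published LCID tables use, so double-check them against your headers).

```python
def langid_from_lcid(lcid: int) -> int:
    """Low 16 bits of an LCID hold the LANGID."""
    return lcid & 0xFFFF


def primary_lang_id(langid: int) -> int:
    """Mirror of the PRIMARYLANGID macro: the low 10 bits."""
    return langid & 0x3FF


def sub_lang_id(langid: int) -> int:
    """Mirror of the SUBLANGID macro: the bits above the primary language."""
    return langid >> 10


EN_US = 0x0409       # en-US
ES_US = 0x540A       # es-US
HR_HR = 0x041A       # hr-HR
BS_LATN_BA = 0x141A  # bs-Latn-BA

# "US" is sublang 0x01 for English but 0x15 for Spanish.
print(hex(sub_lang_id(langid_from_lcid(EN_US))))
print(hex(sub_lang_id(langid_from_lcid(ES_US))))

# Croatian and Bosnian share a single primary language id.
print(primary_lang_id(HR_HR) == primary_lang_id(BS_LATN_BA))
```

So testing the sublang for "is this the US?" or the primary langid for "is this Croatian?" gives the wrong answer, whereas comparing ISO region and language names does not have that problem.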

So, if you aren't already using language/locale names, please think about moving to them.  Your code will be more understandable, future-proof and robust.

And whether you're using names or LCIDs, the user default is often a good choice (e.g., pass LOCALE_NAME_USER_DEFAULT to your GetLocaleInfoEx call).
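
As a rough illustration of that call, here is a sketch that asks GetLocaleInfoEx for the user default's ISO language and region names via ctypes.  The wrapper get_locale_info is a hypothetical helper, the constant values are what I believe the Windows headers define, and on anything other than Windows the function simply returns None, so treat it as a starting point rather than production code.

```python
import ctypes
import sys

LOCALE_NAME_USER_DEFAULT = None   # NULL in the native API
LOCALE_SISO639LANGNAME = 0x59     # ISO 639 language name, e.g. "en"
LOCALE_SISO3166CTRYNAME = 0x5A    # ISO 3166 region name, e.g. "US"
LOCALE_NAME_MAX_LENGTH = 85


def get_locale_info(lctype, locale_name=LOCALE_NAME_USER_DEFAULT):
    """Return the requested locale string, or None when not running on Windows."""
    if sys.platform != "win32":
        return None
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    get_info = kernel32.GetLocaleInfoEx
    get_info.argtypes = [ctypes.c_wchar_p, ctypes.c_uint32,
                         ctypes.c_wchar_p, ctypes.c_int]
    get_info.restype = ctypes.c_int
    buffer = ctypes.create_unicode_buffer(LOCALE_NAME_MAX_LENGTH)
    # A return value of 0 means the call failed.
    if get_info(locale_name, lctype, buffer, len(buffer)) == 0:
        raise ctypes.WinError(ctypes.get_last_error())
    return buffer.value


print(get_locale_info(LOCALE_SISO639LANGNAME))
print(get_locale_info(LOCALE_SISO3166CTRYNAME))
```

Passing an explicit name such as "fj-FJ" as the second argument works the same way, with no LCID involved at any point.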


PS: If you're still stuck for a while on LCIDs, please at least don't expect the LANGID parsing macros to be perfect or useful while you transition towards names.  Also realize that some user settings are going to end up with the LOCALE_CUSTOM_DEFAULT or other variations.  An LCID might be OK for querying GetLocaleInfo for information, but at a later date or on a different machine, it might have a different meaning.
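
If you do have to carry LCIDs around for a while, one defensive step is to notice the placeholder values before you store or transmit them.  The sketch below assumes the custom and transient LCID values as I understand them from the Windows headers and the published LCID list; the helper name is hypothetical, and the set may not be exhaustive.

```python
LOCALE_CUSTOM_DEFAULT = 0x0C00
LOCALE_CUSTOM_UNSPECIFIED = 0x1000
LOCALE_CUSTOM_UI_DEFAULT = 0x1400
TRANSIENT_LCIDS = (0x2000, 0x2400, 0x2800, 0x2C00)

PLACEHOLDER_LANGIDS = frozenset(
    (LOCALE_CUSTOM_DEFAULT, LOCALE_CUSTOM_UNSPECIFIED, LOCALE_CUSTOM_UI_DEFAULT)
    + TRANSIENT_LCIDS
)


def is_placeholder_lcid(lcid: int) -> bool:
    """True if the LCID does not identify one fixed locale across machines or time."""
    return (lcid & 0xFFFF) in PLACEHOLDER_LANGIDS


print(is_placeholder_lcid(0x0409))  # en-US is an ordinary assigned LCID
print(is_placeholder_lcid(0x1000))  # shared by locales with no LCID of their own
```

When the check returns True, persisting the locale name instead of the number avoids the ambiguity.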