Today, I am going to talk about language groups. They are a little confusing....

First we will take their names, straight from our master NLS header file, winnls.h (taken from the one that ships with VS.NET 2003):

//
//  Language Group ID Values.
//
#define LGRPID_WESTERN_EUROPE        0x0001   // Western Europe & U.S.
#define LGRPID_CENTRAL_EUROPE        0x0002   // Central Europe
#define LGRPID_BALTIC                0x0003   // Baltic
#define LGRPID_GREEK                 0x0004   // Greek
#define LGRPID_CYRILLIC              0x0005   // Cyrillic
#define LGRPID_TURKISH               0x0006   // Turkish
#define LGRPID_JAPANESE              0x0007   // Japanese
#define LGRPID_KOREAN                0x0008   // Korean
#define LGRPID_TRADITIONAL_CHINESE   0x0009   // Traditional Chinese
#define LGRPID_SIMPLIFIED_CHINESE    0x000a   // Simplified Chinese
#define LGRPID_THAI                  0x000b   // Thai
#define LGRPID_HEBREW                0x000c   // Hebrew
#define LGRPID_ARABIC                0x000d   // Arabic
#define LGRPID_VIETNAMESE            0x000e   // Vietnamese
#define LGRPID_INDIC                 0x000f   // Indic
#define LGRPID_GEORGIAN              0x0010   // Georgian
#define LGRPID_ARMENIAN              0x0011   // Armenian

This list will help you start feeling the confusion. They seem to be based on language. No, that's not right, it must be script. No, maybe based on region. No? Perhaps something else entirely.

Heck, the English - New Zealand locale shows up under the "Western European" language group. Riddle me that one, won't you?

In Windows 2000, language groups matched the big list of "languages your system supports" in Regional Options, and that is when language groups were analogous to a feature in the operating system.

Ok, maybe NLS Terminology will set us straight on what they are supposed to be:

Purpose: Provides all keyboard layouts, IMEs, TrueType fonts, font links, LPKs, bitmap fonts and code page translation tables needed by the system for a group of languages. Therefore impacts all other settings in this list.

Expository text: The language group controls which system locale, user locales, input locales, and user interface (UI) languages can be selected. For example, Windows installs the Western Europe and United States language group by default. This default cannot be removed. For each localized version, the specified language group is the default and cannot be removed. Thus, if the English version of Windows is installed in a non-English speaking country/region, the user will typically install another language group.

When adding a language group, Windows copies (but does not activate) the necessary keyboard files, Input Method Editors (IMEs), TrueType Font files, bitmap font files, and National Language Support (.nls) files. Adding a language group also adds registry values for font linking and installs scripting engines for complex script languages (Arabic, Hebrew, Indic, and Thai).

Giving (for example) Armenian its own language group did not really serve much purpose here, since the font was pretty small, there is no IME, and no special system support is required like with other language groups. So they do not feel like a big group of equal partners....

Then, starting in Windows XP and Server 2003, the notion was largely replaced by two checkboxes in the second tab of Regional and Language Options -- essentially giving us just three groups:

  1. East Asian languages (basically ideographic scripts e.g. Chinese)
  2. Complex script languages (e.g. Thai, Hebrew, Indic, Arabic script)
  3. Everyone else the system supports

The idea was that Windows would support everybody unless it required a ton of IME/font support files (category 1) or turning on complex script support throughout the OS (category 2).

And in XP and Server 2003, even trying to install language group support à la Q289125 will go in and decide which of the above categories the request was in, and install that entire category's underlying technological support. So now language groups (which were never too clearly defined anyway) are at the level of an appendix or a vestigial tail!
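To make that collapse concrete, here is a rough sketch of how the seventeen IDs might fold into the three buckets. It is only an illustration, not how Windows actually does it: the InstallCategory enum and the category_for_language_group function are names I made up, and the membership of each bucket is partly an assumption. The text above only names Chinese as East Asian and Thai, Hebrew, Indic and Arabic as complex script; putting Japanese and Korean in the first bucket, and Vietnamese, Georgian and Armenian in the second, is my guess.

```c
#include <assert.h>

/* Language group IDs, copied from the winnls.h excerpt above.
   Guarded so the sketch still compiles where <windows.h> is included. */
#ifndef LGRPID_WESTERN_EUROPE
#define LGRPID_WESTERN_EUROPE        0x0001
#define LGRPID_CENTRAL_EUROPE        0x0002
#define LGRPID_BALTIC                0x0003
#define LGRPID_GREEK                 0x0004
#define LGRPID_CYRILLIC              0x0005
#define LGRPID_TURKISH               0x0006
#define LGRPID_JAPANESE              0x0007
#define LGRPID_KOREAN                0x0008
#define LGRPID_TRADITIONAL_CHINESE   0x0009
#define LGRPID_SIMPLIFIED_CHINESE    0x000a
#define LGRPID_THAI                  0x000b
#define LGRPID_HEBREW                0x000c
#define LGRPID_ARABIC                0x000d
#define LGRPID_VIETNAMESE            0x000e
#define LGRPID_INDIC                 0x000f
#define LGRPID_GEORGIAN              0x0010
#define LGRPID_ARMENIAN              0x0011
#endif

/* Hypothetical names for the three buckets; not from any Windows header. */
typedef enum {
    INSTALL_BASE,            /* 3. everyone else the system supports */
    INSTALL_COMPLEX_SCRIPT,  /* 2. complex script languages          */
    INSTALL_EAST_ASIAN,      /* 1. East Asian languages              */
    INSTALL_UNKNOWN          /* not one of the seventeen IDs         */
} InstallCategory;

/* Hypothetical helper: fold a language group ID into one of the buckets. */
InstallCategory category_for_language_group(unsigned long lgrpid)
{
    switch (lgrpid) {
    case LGRPID_TRADITIONAL_CHINESE:
    case LGRPID_SIMPLIFIED_CHINESE:
    case LGRPID_JAPANESE:            /* assumption: grouped with Chinese */
    case LGRPID_KOREAN:              /* assumption: grouped with Chinese */
        return INSTALL_EAST_ASIAN;

    case LGRPID_THAI:
    case LGRPID_HEBREW:
    case LGRPID_ARABIC:
    case LGRPID_INDIC:
    case LGRPID_VIETNAMESE:          /* assumption */
    case LGRPID_GEORGIAN:            /* assumption */
    case LGRPID_ARMENIAN:            /* assumption */
        return INSTALL_COMPLEX_SCRIPT;

    case LGRPID_WESTERN_EUROPE:
    case LGRPID_CENTRAL_EUROPE:
    case LGRPID_BALTIC:
    case LGRPID_GREEK:
    case LGRPID_CYRILLIC:
    case LGRPID_TURKISH:
        return INSTALL_BASE;

    default:
        return INSTALL_UNKNOWN;
    }
}

/* Small self-check showing the intended use. */
void check_language_group_examples(void)
{
    assert(category_for_language_group(LGRPID_SIMPLIFIED_CHINESE) == INSTALL_EAST_ASIAN);
    assert(category_for_language_group(LGRPID_HEBREW) == INSTALL_COMPLEX_SCRIPT);
    assert(category_for_language_group(LGRPID_WESTERN_EUROPE) == INSTALL_BASE);
}
```

Under those assumptions, a request for any single group in a bucket (say, Thai) would end up installing the support for the whole bucket, which is why the individual IDs no longer carry much meaning.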

What happens with them going forward? Who can say? In a world where parts of Australia are classified as being Western European, anything is possible! :-)

 

This post brought to you by "Ƣ" (U+01a2, a.k.a. LATIN CAPITAL LETTER OI)