Sorting it all Out Michael Kaplan's random stuff of dubious value Be sure to read the disclaimer here first!
Today, I'll highlight one of the ways in which some of the work to extend locales shows our reach exceeding our grasp.
Now I have talked about digit substitution so much in the past that most regular readers are probably tired of hearing about it....
Frankly, I don't blame them!
So I'm not gonna be all about digit substitution this time.
Since that feature is locale based, and the languages I am going to talk about here have no locales, it isn't relevant anyway....
But beyond that, Tai Le (languages using it include Dehong Dai) has an additional problem -- they use the Myanmar digits!
We worked around the problem of making sure the digits get seen in some cases by adding the Myanmar digits (U+1040 to U+1049) to both the Microsoft Tai Le font that shipped previously and the new Myanmar Text font:
You'll notice that they look slightly different, reflecting two entirely different traditions, as mentioned in the old proposal N2372:
In addition to these differences, there are also slight size differences between glyphs in the two fonts.
Which were designed by two different typographers.
In two different styles.
To support two different scripts.
Plus it appears they didn't even completely capture the distinctions mentioned above -- I need to find out if that's a bug or not (perhaps the alternate glyphs are in the font but only available using advanced OpenType features, which as I mentioned previously few technologies support).
Just ten little digits:
And of course Myanmar Text also has the Myanmar Shan Digits in it, which are not in the Microsoft Tai Le font update:
Kind of funny how Unicode decided to capture those differences but not the others.
Not funny "ha ha", if you know what I mean.
I guess the Tai Le differences weren't different enough....
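For anyone who wants to see the two encoded sets side by side, here is a minimal Python sketch that lists the standard Myanmar digits and the Myanmar Shan digits by code point. It only reads the Unicode character data that ships with Python; the `DIGIT_RANGES` table and the `describe` helper are names made up for this illustration, and the U+1090 to U+1099 range for the Shan digits comes from the Unicode charts rather than from anything above.

```python
import unicodedata

# Ranges per the Unicode code charts; the labels are only for display.
DIGIT_RANGES = {
    "Myanmar digits": range(0x1040, 0x104A),       # U+1040..U+1049
    "Myanmar Shan digits": range(0x1090, 0x109A),  # U+1090..U+1099
}

def describe(cp):
    """Return (code point label, character name, decimal value) for one code point."""
    ch = chr(cp)
    return (f"U+{cp:04X}", unicodedata.name(ch), unicodedata.decimal(ch))

for label, cps in DIGIT_RANGES.items():
    print(label)
    for cp in cps:
        print("  ", *describe(cp))
```

Both sets carry decimal values 0 through 9, so as far as the character data is concerned they are two independent sets of digits. Nothing in it says which glyph tradition a font should draw for U+1040 to U+1049.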
But leaving the Shan digits aside, let's consider the two ways of looking at the standard Myanmar digits (U+1040 to U+1049) in these two fonts.
Uniscribe and the like have several options for how to display these digits:
At the time the addition of Tai Le was discussed in Unicode (pre-Unicode 4.0), this very issue and also the different forms were discussed, and almost led to both sets of digits being defined in Unicode in the two different blocks. But the theoretical nature of the first problem (Microsoft wouldn't add support until years later, in Windows 7) and the fact that the second problem was widely treated as a minor typographic issue kept one set being used by both scripts.
And to our current troubles....
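The outcome of that decision is easy to check against the character data. A small sketch, assuming the Tai Le block runs from U+1950 to U+197F and the Myanmar block from U+1000 to U+109F (block boundaries taken from the Unicode charts; `decimal_digits_in` is a helper invented for this example):

```python
import unicodedata

TAI_LE_BLOCK = range(0x1950, 0x1980)   # U+1950..U+197F
MYANMAR_BLOCK = range(0x1000, 0x10A0)  # U+1000..U+109F

def decimal_digits_in(block):
    """Return the code points in `block` whose general category is Nd (decimal digit)."""
    return [cp for cp in block if unicodedata.category(chr(cp)) == "Nd"]

print("Tai Le: ", [f"U+{cp:04X}" for cp in decimal_digits_in(TAI_LE_BLOCK)])
print("Myanmar:", [f"U+{cp:04X}" for cp in decimal_digits_in(MYANMAR_BLOCK)])
```

The Tai Le list comes back empty, so any digit in Tai Le text has to be borrowed from somewhere else, and the font question comes along with it.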
Now I don't want to imply that either Tai Le or Myanmar are not good sharers or that they both need a time out.
Well, not exactly.... :-)
There are many reasonable language experts and font developers who will consider the last two bullet points above to be genuine bugs that break support that used to work in prior versions.
As bad as the third point is, imagine incompletely re-applying the font in the fourth case -- a problem that many people might recognize from longstanding problems with Japanese text partially rendered with Chinese fonts!
Those people aren't wrong; there are just disadvantages to working on the edges of languages, of locales, that we start to support....
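To make the font choice a bit more concrete, here is a purely hypothetical sketch of one way a renderer could decide which of the two fonts to use for the shared digits: let them inherit from the neighboring letters. This is not a description of what Uniscribe, Word, or RichEdit actually do. The `script_of` and `pick_fonts` functions, the "inherit from the nearest letter" rule, and the Myanmar default are all assumptions made for illustration; only the two font names come from the discussion above.

```python
# Hypothetical heuristic, NOT Uniscribe's algorithm.
TAI_LE = range(0x1950, 0x1980)         # Tai Le block
MYANMAR = range(0x1000, 0x10A0)        # Myanmar block
SHARED_DIGITS = range(0x1040, 0x104A)  # Myanmar digits, also used with Tai Le

FONTS = {"tai_le": "Microsoft Tai Le", "myanmar": "Myanmar Text"}

def script_of(ch):
    """Classify one character; the shared digits and anything else count as neutral (None)."""
    cp = ord(ch)
    if cp in SHARED_DIGITS:
        return None
    if cp in TAI_LE:
        return "tai_le"
    if cp in MYANMAR:
        return "myanmar"
    return None

def pick_fonts(text, default="myanmar"):
    """Pick a font name per character. Neutral characters inherit from the
    preceding script run, or from the following one if nothing precedes them."""
    scripts = [script_of(ch) for ch in text]
    last = None
    for i, s in enumerate(scripts):             # forward pass: inherit from the left
        if s is None:
            scripts[i] = last
        else:
            last = s
    following = None
    for i in range(len(scripts) - 1, -1, -1):   # backward pass: fill leading neutrals
        if scripts[i] is None:
            scripts[i] = following
        else:
            following = scripts[i]
    return [FONTS[s or default] for s in scripts]

# A Tai Le letter followed by two digits: all three characters get the same font.
print(pick_fonts("\u1950\u1041\u1042"))
# Digits with no context at all fall back to the (assumed) Myanmar default.
print(pick_fonts("\u1041\u1042"))
```

The second call is the uncomfortable one: with no surrounding letters there is nothing to inherit from, so whatever default gets picked will be wrong for one of the two scripts.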
And of course in the case of issues in Word and RichEdit, there are disadvantages to not carefully dealing with the well-intended (though in my opinion somewhat flawed) designs of some programs and controls.
It is in its own way somewhat ironic that the default behavior doesn't point in the direction of the script and language used for Chinese minority language support. Oops!
But we can just keep that our little secret, right? :-)
This seems fairly in-line with Unicode precedent though, notably Han unification. It always seems to cause controversy though! Perhaps Unicode should define a set of language/glyph control codepoints?
Previous blogs from this series:
part 18 (Two scripts that share ten digits can be trouble)
part 19 (In honor of International Mother Language Day...)
part 20 (Yes, it's Bangla. Not Bengali!)
part 21 (The Windows 8 Hijripalooza extraordinaire!)
part 22 (Digit Substitution 2.0)
part 23 (Tamazight? Outta sight!)
part 24 (I Adar you! Hell, I Double Adar you! - Windows 8 ed.
part 25 (Something old, something new, something repurposed, and
part 26 (Hey Windows 8, there's someone on the phone for you.
part 27 (No, the T and the H aren't silent...)