The other day, colleague Tom asked me:
Why would LOCALE_IINTLCURRDIGITS ever be different from LOCALE_ICURRDIGITS ?
This question is a lot more complicated than one might first imagine.
We'll start with the definitions:
You may want to read these definitions a few different times each, so you can try to understand the difference.
Did it work? Do you understand the difference?
Me neither.
Without explaining the difference between "local" and "international" in this context, it is not clear that they are any different -- or if so, how they are different.
Let's look for other clues.
If you look at the big list of Locale Information Constants, LOCALE_IINTLCURRDIGITS claims to be a part of Constants Used by GetLocaleInfo and GetLocaleInfoEx Only, while LOCALE_ICURRDIGITS claims to be a part of Constants Used by Both SetLocaleInfo and GetLocaleInfo/GetLocaleInfoEx. That's a genuine difference.
The CURRENCYFMT Structure, the NumberFormatInfo class, and the Microsoft Locale Builder all have one thing in common: just one setting.
The underlying data store in locale.nls also has only one item in it, though the original source files still have two items. One just gets ignored when the file is being built.
At runtime, the only difference is that one only works for getting while the other works for getting AND setting. They are both using the same value (if you set LOCALE_ICURRDIGITS then LOCALE_IINTLCURRDIGITS is also changed).
So the difference is that there is no actual difference!
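If you want to check this on your own machine, here is a minimal sketch that asks GetLocaleInfoEx for both values through ctypes. The helper name currency_digits is made up for this example, the constant values are the ones from winnls.h, and the function simply returns None when it is not running on Windows. Treat it as an illustration rather than a definitive test harness.

```python
import ctypes
import sys

# LCTYPE values as defined in winnls.h
LOCALE_ICURRDIGITS = 0x00000019
LOCALE_IINTLCURRDIGITS = 0x0000001A
LOCALE_RETURN_NUMBER = 0x20000000  # ask for a DWORD instead of a string


def currency_digits(locale_name):
    """Return (ICURRDIGITS, IINTLCURRDIGITS) for a locale name such as "ja-JP".

    Hypothetical helper for illustration only; returns None off Windows.
    """
    if sys.platform != "win32":
        return None

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    get_info = kernel32.GetLocaleInfoEx
    get_info.argtypes = [ctypes.c_wchar_p, ctypes.c_uint32,
                         ctypes.c_void_p, ctypes.c_int]
    get_info.restype = ctypes.c_int

    values = []
    for lctype in (LOCALE_ICURRDIGITS, LOCALE_IINTLCURRDIGITS):
        number = ctypes.c_uint32(0)
        # With LOCALE_RETURN_NUMBER the buffer is a DWORD, and its size is
        # passed in WCHAR units.
        size = ctypes.sizeof(number) // ctypes.sizeof(ctypes.c_wchar)
        if get_info(locale_name, lctype | LOCALE_RETURN_NUMBER,
                    ctypes.byref(number), size) == 0:
            raise ctypes.WinError(ctypes.get_last_error())
        values.append(number.value)
    return tuple(values)
```

Calling currency_digits("ja-JP") on Vista or later should, if the description above holds, return the same number twice.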
This was not always the case though, mind you.
If you go back to Windows XP and Server 2003, they were two separate items that could be different, and there were two theoretical purposes:
In practice, purposes such as these were represented in the data for some locales in Server 2003 and earlier -- Japan, for example, did indeed have a LOCALE_ICURRDIGITS of 0 and a LOCALE_IINTLCURRDIGITS of 2. But this kind of difference vanished by the time Vista's locale data shipped. Note that formatting never used it, the documentation was never clear, and despite disappearing from everywhere as a distinct item there is a real lack of complaining about it.
Think of LOCALE_IINTLCURRDIGITS as an evolutionary blind alley in the development of Windows locale data.
And the documentation simply hasn't caught up to describing this vestigial tail (that catchup is slightly perilous for as long as it is reasonable to use the docs for Windows XP/Server 2003 and earlier, unless the full scope of the setting is also documented).
Perhaps it just needs a kickass KB article.
Or a kickass blog. :-)
The Sally Kimball Addition To The Dead Keys Conundrum: An Encyclopedia Brown Mystery
After I wrote up The Dead Keys Conundrum: An Encyclopedia Brown Mystery and Solution: The Dead Keys Conundrum: An Encyclopedia Brown Mystery (in response to issues first raised in Chain Chain Chain, Chain of Dead Keys), then in most cases in the world of Encyclopedia Brown the mystery would be solved and it would be on to the next mystery, in the next chapter (they usually seemed to come bundled in tens, if memory serves).
However, at least one character, Sally Kimball, would occasionally outsmart Encyclopedia Brown and be able to shed a bit of additional light when he was stumped by something.
He never minded this, since they were friends. She had already beaten up the bully who tried to beat Encyclopedia Brown up (Bugs Meaney), so if she occasionally proved herself to also be smart then he didn't mind.
In this particular mystery, The Dead Keys Conundrum, Van Anderson has (perhaps unintentionally) filled the Sally Kimball role in the mystery, in a comment to the "solution" blog:
You say the only option you don't have is to throw away the keystroke itself, but would it not work to define all of your garbage sequences to NULL? I may be wrong - it wouldn't be the first time - but defining all your composites as U+0000 should leave no mark in the text stream, right?
I was initially skeptical -- not because it was a bad idea (since after all it wasn't), but due to fear of the undocumented and the fact that perhaps it would be a bad idea to depend on the behavior if it were to change (as undocumented behavior is occasionally wont to do).
But that answer feels a little unsatisfying. So I decided to dig a little.
I was doing a build of a depot and I was only 6% through refreshing my publics, so as any developer from Microsoft could tell you, I had a few moments.
First step was to create a keyboard with some NULL (U+0000) characters defined -- both as regular keys and as dead key results.
When I did this and tested out the keyboard, I found that in both cases:
Sounds familiar, doesn't it? :-)
I'll give you a hint -- take a look at Short-sighted text processing #1: Uniscribe filters nothing.
This behavior of inserting nothing and beeping is how the behavior incorrectly attributed to Uniscribe is accomplished -- the EDIT control code replaces the text in the stream with a U+0000, which the underlying system refuses to insert, and beeps.
To be strictly accurate, the valid text is always null terminated, and thus in this case what is seen is a string of zero length which is ordinarily not expected. But the behavior is the same so that detail is not strictly necessary here. :-)
Changing this behavior would be a significant potential backcompat problem, and although the literal cause of the behavior (inserting a NULL into the input text stack) is not specifically documented, there is at a minimum some behavior that is using this undocumented underlying implementation detail to support a documented feature.
Supporting the documented behavior while changing the details of the undocumented behavior is problematic and likely not feasible. Plus you could unintentionally break someone else's assumptions.
All of my prior, other complaints about the behavior I described in Short-sighted text processing #1: Uniscribe filters nothing (e.g. that text you display but didn't type that does show up) would not apply here -- we are talking only of keyboards.
It is true that MSKLC has problems with loading the keyboard I created in MSKLC to test the behavior above:
but this makes sense -- I had no idea about the behavior and so the code was never expecting this thing that I had no reason to expect. Had I known it then I probably would have not only recalled it when the mystery blogs unfolded but I would likely have written the behavior up years ago.
I never knew about it.
Now in the case of invalid dead keys, it is (admittedly) slightly unwieldy (as I mentioned in Solution: The Dead Keys Conundrum: An Encyclopedia Brown Mystery, you would have to define every type-able character on the keyboard in each dead key table), but the behavior of beeping and typing nothing is infinitely preferable to inserting the wrong character (or even worse two wrong characters) into text, when the user has no way to connect what they typed with what was inserted.
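To make Van's idea concrete, here is a small model of the behavior described above: a composite defined as U+0000 inserts nothing and beeps, while a combination that is not defined at all still dumps both characters into the text. This is only a sketch of the observed behavior, not the actual Windows input stack; the function name type_keys and the grave accent table are invented for the example.

```python
NUL = "\u0000"


def type_keys(keys, dead_keys):
    """Model typing a sequence of keys.

    dead_keys maps each dead key character to its {base: composite} table.
    Returns (inserted_text, beep_count).
    """
    text = []
    beeps = 0
    pending = None  # dead key waiting for its base character
    for key in keys:
        if pending is None and key in dead_keys:
            pending = key  # nothing is inserted yet
            continue
        if pending is None:
            produced = key
        else:
            # Undefined combinations fall back to "both characters at once".
            produced = dead_keys[pending].get(key, pending + key)
            pending = None
        if produced == NUL:
            beeps += 1  # the NULL is refused: insert nothing, just beep
        else:
            text.append(produced)
    return "".join(text), beeps


# Hypothetical grave accent dead key: one real composite, one "garbage"
# combination defined as U+0000, and everything else left undefined.
GRAVE = {"`": {"a": "\u00E0", "q": NUL}}

print(type_keys("`a", GRAVE))  # ('à', 0)
print(type_keys("`q", GRAVE))  # ('', 1)
print(type_keys("`z", GRAVE))  # ('`z', 0)
```

The third call shows why every type-able character still has to be listed in each table: anything left out goes back to the two-character behavior.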
If anyone ever did their own input stack and they didn't handle the NULL the same way, then it is possible that they will break many keyboards beyond the fancy chained dead key one -- like the four Thai keyboards, for example. I'd be curious what WPF controls hosted in WinForms using Uniscribe do, for example. I assume they aren't screwing up the text or inserting random NULLs or someone would have reported that bug by now. :-)
Perhaps some future version of MSKLC could fix all of these problems/limitations/bugs:
Mentally I have halfway worked through how I might approach all of the above; if I thought there was a chance any of it would happen I'd write it up for either me to do or someone else to do. But that seems pretty unlikely (chained dead keys are just pretty esoteric, so even if there were MSKLC plans I'd imagine these to not be seriously considered).
Perhaps they could put MSKLC on CodePlex (something I had several people suggest to me last week) -- I'd likely contribute, in that case. I doubt they'd mind. :-)
In any case, I hope Van is not too offended by my analogy, since I really did appreciate his "Sally Kimball" role here that pushed me to give the better answer.
Which I think this often can be!
So the other day I was asked about Microsoft's lack of support for the Moldavian currency.
There is a very good reason that no such support could be found.
Because there is no Moldavian locale.
And there is a pretty good reason that no such support exists.
Because there isn't a Moldavia any longer.
I could get into it but I do not have as much detail as Wikipedia provides:
Moldavia (Romanian: Moldova), is a geographic and historical region and former principality in Eastern Europe, corresponding to the territory between the Eastern Carpathians and the Dniester river. An initially independent and later autonomous state, it existed from the 14th century to 1859, when it united with Wallachia as the basis of the modern Romanian state; at various times, the state included the regions of Bessarabia (with the Budjak), all of Bukovina and (under Stephen the Great) Pokuttya. The western part of Moldavia is now part of Romania and the eastern part belongs to the Republic of Moldova, while the northern and south-eastern parts are territories of Ukraine.
Perhaps the person asking the original question actually meant Moldovan.
If that's the case, we have another problem.
There isn't Moldovan currency support, either.
But there is a very good reason for that.
There isn't a Moldovan locale in Windows.
Unfortunately, there isn't quite so good of a reason for that....
True enough.
No request, no data. Though there is a subsidiary I think, and a subweb (see Microsoft Moldova here, contrast with Microsoft Romania here). I guess no one got around to asking for it yet.
Michael, I hope you did not write an entire blog that provides no hope whatsoever.
There is some hope, actually.
Factlet: the language (Moldovan) is pretty much Romanian (more or less), and their currency is the Moldovan leu (MDL), which is not exactly the same thing as the Romanian leu (RON).
Since Moldovan - Moldova (ro-MD, though the [deprecated] mo-MD could technically also be used but I'd fear for interoperability issues) is largely considered to be like Romanian, a custom locale could be created by taking ro-RO and making minor changes to it to take care of the region specific and currency specific stuff. And the formatting appears to be the same as well...
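The "take ro-RO and change a few things" idea can be pictured with a tiny sketch. This is not the Locale Builder file format or any Windows API; the LocaleData class and its field names are hypothetical, and the only facts it encodes are the locale names and the ISO currency codes mentioned above (the two decimal digits are an assumption for illustration).

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LocaleData:
    """Hypothetical, heavily simplified stand-in for a locale definition."""
    name: str
    region: str
    currency_iso: str
    currency_name: str
    currency_digits: int


# Start from the existing Romanian locale...
RO_RO = LocaleData(
    name="ro-RO",
    region="Romania",
    currency_iso="RON",
    currency_name="Romanian leu",
    currency_digits=2,  # assumed for illustration
)

# ...and override only the region-specific and currency-specific pieces.
RO_MD = replace(
    RO_RO,
    name="ro-MD",
    region="Moldova",
    currency_iso="MDL",
    currency_name="Moldovan leu",
)

print(RO_MD)
```

Everything not overridden (here just the digit count) is inherited from ro-RO, which is the whole point of starting from an existing locale.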
Which points to why there may not be a huge rush to support a locale, since immediate needs are met? This is just a guess on my part, but it seems plausible. I haven't heard tell of calls from Moldova, at least....
Why didn't you say so in the first place?
Because the initial question referred to a place that hasn't existed for over 150 years! The "hope" is only hope if they meant something else entirely....
If my Adventures in Insurance annoy you, then you can skip this blog....
Regular readers may remember something of my insurance travails in getting my iBot battery replaced.
If not, you can go to Why not engineer inefficiency?, aka "Forget to charge my battery and take me to Jabba now" the iBot said to see the whole story about how Premera Blue Cross paid only $24.09 on a battery that costs $1100.00.
Basically, they billed it with the code 2008 HCPCS K0108 (the text description for this code is "Wheelchair component or accessory, not otherwise specified"). This sounds like maybe they replaced the cigarette lighter if it had one or the valve stem cover on one of the tires.
Note to Blue Cross: if you are billed $24.09 for the valve stem cover they are bilking you.
But I was not bilking here.
Finally, after claiming many times that they weren't receiving all the right paperwork (at this point several copies of my appeal are floating around over there; one day they may double or triple pay me if they keep track of the money as closely as they keep track of the appeal paperwork -- note that I will give them their money back since I'm not a thief), they responded.
Anyway, here is the text of the letter:
This letter is to inform you of the outcome of the appeal you submitted requesting additional reimbursement for the wheelchair batteries you received on August 17, 2010 from Depuy Orthopaedics Incorporated. After a comprehensive review of the appeal, Premera has concluded that the wheelchair batteries are eligible for additional payment under the Plan. The reasons for the approval of your appeal are explained below.

Appeal Process

To evaluate this appeal, Premera included the participation of a Medical Services Administrator (MSA). The MSA reviewed your letter of appeal, all relevant information, and the terms and exclusions of the Microsoft Live Well Premera plan.

Reasons for Approval of Appeal

The Microsoft Live Well Premera plan allows benefits for charges to repair or replace covered items when worn out by normal use. The reviewer concluded that the description of this particular item is not comparable to any durable medical equipment or supply item procedure code description currently in place that has an established allowable charge. Based on this information the reviewer determined it is appropriate to allow the wheelchair batteries you received from Depuy Orthopaedics Incorporated replacement at the billed charge.

We are having the claim for the wheelchair batteries received on August 17, 2010, reprocessed to reflect the outcome of this appeal review. Benefit coverage is subject to eligibility, the terms of the Microsoft LiveWell Premera plan benefits and all contractual limitations and provisions.
Note that they have not paid yet: they will now process it, and hopefully cut me a check at some point.
Since the batteries were received last August 17th and I paid for them about a week before that, we are on track to see me get paid at the one year anniversary.
Of course Blue Cross has had use of the $1100 for that whole time but I will receive no interest or allowance for inflation for this loan that I have given them while they bumbled paperwork.
I suppose I should say thank you to the thievery.
Now I did happen to call them to ask whose job it should be to request that a code be added -- I mean, given the problem here doesn't that make sense?
I was told that it would not be Blue Cross; they are merely denying the submitted code and it would be the people who want to be paid who would do that.
But I'm the one who wants to be paid! Do I have a way to get a code added?
Apparently I do not. Only a provider of services or equipment can do that.
Of course in this case the provider doesn't really do that sort of thing because they are closing up shop in a few years and the billing insurance is just a convenience thing: they make me pay upfront for everything.
So they have their money.
Of course eventually I will need a new battery again.
At the rate they are going, I'll be ready for a new one soon after they pay for the old one!
Then Premera Blue Cross will just have to go through the same rigmarole again.
The form to submit a claim has no space for "special appeal claim # for the last time the very same claim was paid off" which means another appeal, hopefully a little faster next time.
I think the purpose here is to get people so pissed that they just give up. They'll spend hundreds (if you consider the hourly rate of the at least 9 people who've had to deal with this over the last 2/3 of a year) to avoid paying me $1000.
And we wonder why insurance doesn't keep pace with inflation?
You know, I could start my own insurance company too. It's just that I'd hate having to feel so unclean every time I'd have to pull this kind of crap that I'd have to be showering 12 times a day. My skin would chafe too much and it just wouldn't work.
I should feel grateful, but really I would like my employer to consider the fact that one of the reasons insurance costs surge is they paid for all of this non-streamlined stuff. You don't think it will be reflected in their rates?
I don't even get it - they pay for the services stuff with minimal probing, and then nickel and dime the device stuff? Geez.
Microsoft could probably hire a nice gaggle of insurance administrator types whose full time job is to intelligently process all of this. They could "self-process" their self-insure, and it would be cheaper than the overhead of what they do now. All they need is a group that they control the charter of....
For info on what happened the next day and the day after, see From Bunnarchy with Santa and Jessica Rabbit to Anime Unleashed, published minutes ago.
Friday.
I went to see The Agony And The Ecstasy of Steve Jobs with the Crew at the Seattle Repertory Theater.
Oh. My. God.
This show blew me away. Completely.
As Scott Weaver mentioned after the show: it wasn't really a play.
He is technically right. This 1 hour and 43 minute monologue by self-identified Apple Fan Boy Mike Daisey with no intermission was not a play.
But it was theater, and it was drama.
It did not bring Apple co-founder Steve "Woz" Wozniak to tears because he had something in his eyes.
It was a dramatic rendering of a situation that should shock anyone with a soul and make any person who uses modern technology (Apple or not) look at that technology differently from the time they see the play and into the foreseeable future.
And now, as I look at my Dell Latitude E6500 (not a MacBookPro) and my HTC Arrive (not an iPhone) and my previous Palm Pre (also not an iPhone) and my Motion LE1600 Tablet (not an iPad) and my Zune (not an iPod), I realize something:
By and large these are devices that are older and have not been upgraded since they were bought (the one exception there is the HTC Arrive which is a Windows Phone 7 device that is Microsoft subsidized or will be if I turn in the receipts in time). I do have a MacBookPro which is the one Apple product I have, and in the last 12 months it has been booted into OS X fewer than five times -- I know this because of how many updates I am asked to install when it does get booted that way rather than into Windows 7 via BootCamp.
I like all of them.
But I can't say I love any of them the way that people talk about how much they love their iPhones or their iPads.
They are just tools to get stuff done.
I don't upgrade them because they get the job done. So I don't care that Dell has faster laptops now or that the tablet is out of warranty and doesn't have cool touch technology like an iPad or that I don't fawn over how I love the phone the way some of my friends talk about their iPhones.
I look at all of these devices differently than the way people who love their Apple products look at their Mac* and i* counterparts.
Because of this, though I do feel some sense of responsibility and some shame for my own part in the engineered human rights issues in Shenzhen and at Foxconn, I can say that at least I never lovingly stroked the pain of other human beings the way so many iPod and iPhone and iPad users have.
Scott and I were talking after Mike Daisey's amazing monologue (as I mentioned), and while my thoughts had still not fully formed (seeing it all is like an unexpected body blow that knocks you over wondering how on Earth he did it), I argued that we really ought to be doing something differently. After ten minutes of arguing how anything we do would cripple American business he stopped himself (largely because he didn't enjoy talking like a Republican!) and I continued talking with Scott's mom, who tried to draw out my opinions a bit because she seemed genuinely interested in understanding a little more about someone who was looking at the situation as something that had to change rather than as something we were powerless to do anything about.
She mis-guessed my age though admitted it was what she called my "idealism" that led to the error. :-)
But really, I think something has to change. We were given a letter from Mike Daisey as we left the theater (better yet, see The Agony And The Ecstasy of Steve Jobs and get your own copy afterward -- that is how you will know what I am going on about!).
Something has to be done. And even though (as I said) I feel a slight moral superiority to those who are a part of that cult of constant upgrade and pleasure of their devices, I still feel responsible for my own contribution to the problem.
A problem largely caused by all of us, by the generations that came after our own industrial revolution as our own companies bought billions worth of stuff from those who took the Special Economic Zone that is Shenzhen and companies like Foxconn and made them into what they are.
More directly, Foxconn recently reported that their owner's revenue was up 26% in February thanks to the iPad 2 and the Kinect. Okay, I get no personal pleasure from the Kinect, as I mentioned in The bizarre variation of a skeleton that is iBot + me, to a Kinect. Though I did say I thought the feature was cool. Thus I was a little more involved here than I even realized, even if I had to pan it personally since I can't use it.
Note to Microsoft Employees: the Kinect has much that comes from Foxconn; we're on the hook here too. If Apple doesn't want to be the market leader that Mike Daisey imagines, maybe Microsoft should step up here and be clear that they are not really wanting the pleasure of those playing with a Kinect to be on the backs of the people working and living in tremendously bad conditions at Foxconn, and in Shenzhen.
Now I don't want a Kinect, just like I don't ever plan to want the iPad I didn't want anyway. I couldn't play a game knowing the people who put the cool device together have neither worker's compensation for injury nor disability, that people as young as 12 are working longer days than the most diehard Microsoft employee so that we can enjoy Dance Studio and the like.
The future me that was going to have the opportunity to be excited about a better Kinect has just suffered a mortal wound that may never heal.
You love your iPad you just upgraded to? Thank some person you don't know with a crippled hand due to lack of worker's compensation (who was fired afterward) or some 13 year old you don't know who never gets to go to school and who worked a 14 hour day, and so on, all of whom worked so you could get your iPad 2 as soon as possible. I wonder if you love them too. Funny way of showing it.
But riddle me this -- I wonder how will you feel the next time you use it. I wonder how you feel if you are reading this blog on that iPad right now.
So I'm not as bad as the iPad/iPhone people?
I really don't give a *** if I'm not.
I need to do better here.
So does Apple.
So does Microsoft.
So do we all.
This last weekend was extremely unusual.
Perhaps not the single most unusual weekend of my life, but in the top five, certainly.
I'll explain in reverse order why this is the case.
It is a tale of joy and pain, of the drunkest of debauchery and the soberest of reflections.
First, there was Sunday, and Sakura-Con. The huge anime conference at the Seattle Convention Center. Here I am with one of the Domo dudes who did so well in the Ballmer "Developers" mashup:
It was pretty amazing, and all I can say is that it is even more amazing in an iBot. To quote Ferris Bueller, "if you have the means, I highly recommend it." Everyone was amazed, and by my somewhat rough calculations (I didn't have a slide rule on me) I am ~8000% more likely to have a woman or lady or girl look at me and say something akin to "that's amazing!" when I have the iBot than back when I did not. I still don't know exactly how to convert this and get to stage 3 but I'm working on it....
I also had a father there offer me $100,000 for my iBot (I've gotten similar offers before). This guy knew it was no longer being made and was interested for his brother's sake. I politely declined but gave him a promise to follow up with him on several "save the iBot" efforts and some other resources. Two 14-year-olds who witnessed me turning down the $100,000 offer were in shock....
Maybe I'll talk more about the new schticks I added to my 'explaining the iBot' repertoire, another day.
For now I'll move off of Sunday and go back in time to Saturday.
Saturday was BUNNARCHY!
Somewhere between 500 and 1000 people took part in a 12-hour themed bar crawl, with everyone dressed as bunnies.
Oh yeah, also several rabbit accessories (carrots!) and one of several Jessica Rabbits (this one played by new friend Erin Bohlmann -- as she'll tell you: don't hate her, she's just drawn that way!) and one undercover Santa Claus (who plans to be an undercover bunny at SantaCon later this year)....
Perhaps you have never found yourself in a Zorro mask and Santa hat riding an iBot with 500+ drunken bunnies somewhere on 1st Avenue in Seattle for half a day.
Let me tell you, if you have not, then you really need to consider trying it!
My Santa logic was based on a South Park -- one where Jesus was mad that he had like two Christmas songs that were about him, and all the rest were about Santa. I consider this bit of costume trickery (assisted by goodest friend Kristi Grinnell) to be Santa's way of pointing out that he gets no coverage during Jesus's big "Walk like a Zombie with a soul" day....
That entire day rocked from beginning to end, all 12+ hours for those who made it the whole way from breakfast at 10:30am to the end of the walk in Belltown.
Which brings us, inevitably, to Friday.
The text from this section was snipped and will appear in a blog within 10 minutes of this one. You'll see why when you read it.
So like I said, a quite unusual weekend....
Yesterday I wrote The Dead Keys Conundrum: An Encyclopedia Brown Mystery about how I figured out a way to solve several longstanding problems with keyboard layouts that had been considered by design limitations years before I was working on keyboards.
In the blog, in the style of Encyclopedia Brown the teenage detective, I provided all of the clues that led me to the solution in the hopes that one of the 1000 or so page views would represent a reader who (upon knowing there was a solution and knowing the clues to the solution were there) would provide the solution.
Just like how we all knew that Encyclopedia Brown's cases always had a solution and thus if we could figure out the clues then we too could solve the mystery.
No one rose to the bait and provided even a guess, so perhaps my goals were unrealistic. But in any case, I will now explain the solution....
From the third problem listed in The keyboard does not do what I tell it to!:
One more -- similar to the last one but with a happier ending
This has been bugging me for months. I am not sure when it started, but any time I try to put an apostrophe into a document, nothing happens. Then if I hit the key again I get two of them. I have to hit the backspace key to get what I wanted. So it takes three keystrokes to get me what should have taken one. Is this some sort of virus? Help!
Ah, no virus this time. However, it turns out that this person had installed the "United States - International" keyboard layout. This layout has the apostrophe as a dead key for an acute accent. And as I have said before, dead keys are not intuitive. In his case either the apostrophe and a space or uninstalling the layout were both okay options. He chose the latter since he did not need the international layout....
The dead key table of the APOSTROPHE on the US International keyboard is:
so when you hit APOSTROPHE nothing happens, but it waits to see if you type one of the characters in that BASE column; if you do then the character in the second column appears. If you do not, then you get the two characters that didn't go together at once -- next to each other. Thus for the United States - International keyboard, you get two apostrophes.
If you wanted to fix this problem of the two characters appearing, all you have to do is remember the principle: this only happens if the keystrokes are undefined. Thus all you have to do is add an entry here to convert it (in this case by perhaps adding U+0027 as both Base and Composite characters -- so typing two apostrophes in this case gives you one apostrophe).
Now perhaps in this case it is just a workaround, but in other common cases where the user might expect a combo to work, you can make it work right -- it's a fix.
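The rule described above (a defined combination gives the composite, an undefined one gives both characters) is easy to model. What follows is a simplified simulation, not the real keyboard driver; type_keys is a made-up name, and the table is only an illustrative subset of the APOSTROPHE dead key entries, not the full US International table.

```python
def type_keys(keys, dead_keys):
    """Model typing; dead_keys maps a dead key to its {base: composite} table."""
    out = []
    pending = None  # dead key that has been pressed but not yet resolved
    for key in keys:
        if pending is None:
            if key in dead_keys:
                pending = key  # nothing appears yet
            else:
                out.append(key)
            continue
        table = dead_keys[pending]
        if key in table:
            out.append(table[key])  # defined: emit the composite
        else:
            out.append(pending)     # undefined: both characters land
            out.append(key)         # in the text, next to each other
        pending = None
    return "".join(out)


# Illustrative subset of the APOSTROPHE dead key table.
US_INTL_SUBSET = {
    "'": {"a": "\u00E1", "e": "\u00E9", "i": "\u00ED",
          "o": "\u00F3", "u": "\u00FA", " ": "'"},
}

# The fix: add U+0027 as both Base and Composite.
FIXED = {"'": {**US_INTL_SUBSET["'"], "'": "'"}}

print(type_keys("'a", US_INTL_SUBSET))  # á
print(type_keys("''", US_INTL_SUBSET))  # ''
print(type_keys("''", FIXED))           # '
```

With the extra entry in place, typing two apostrophes gives one apostrophe, while everything else behaves as before.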
Another example might be in order.
Let's take a keyboard that provides the GRAVE ACCENT as a dead key for A/E/I/O/U.
The beginning of the dead key table is obvious, but then perhaps you don't want GRAVE ACCENT + LETTER Q to show up as `Q, and so on.
You can then set up the table like the following:
to have the last character you typed be the only character that shows up (as if you were filtering out the illegal combination by removing the bogus diacritic).
Or you could go in a different direction, such as converting the bogus combinations into a space:
Or you could be really outrageous, and make it a backspace:
The only option you don't have is to throw away the keystroke itself (the backspace is a very aggressive approach since it removes the previous letter -- a nice user hostile interface. :-)
Anyway, you get the point -- if you don't like two characters popping up on bogus combinations, then all you have to do is define the behavior for the bogus combinations.
Kind of makes sense when you think about it -- and an entirely natural thought progression (the way to guard against "undefined behavior" is to define it).
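Since "define every bogus combination" means a lot of repetitive table entries, here is a sketch of generating them. The three policies mirror the three options above (keep the last character typed, convert to a space, or convert to a backspace). The names define_bogus and POLICIES are invented for the example, and the set of type-able characters is just the lowercase letters, where a real layout would need every character in every shift state.

```python
import string

# GRAVE ACCENT as a dead key for a/e/i/o/u (the "obvious" part of the table).
GRAVE = {"a": "\u00E0", "e": "\u00E8", "i": "\u00EC",
         "o": "\u00F2", "u": "\u00F9"}

# What to emit for a combination the layout author considers bogus.
POLICIES = {
    "keep_last": lambda base: base,  # drop the diacritic, keep the letter
    "space": lambda base: " ",       # replace the whole thing with a space
    "backspace": lambda base: "\b",  # the outrageous option
}


def define_bogus(table, typeable, policy):
    """Return a copy of table with every undefined base character defined."""
    fill = POLICIES[policy]
    completed = dict(table)
    for base in typeable:
        # Existing (valid) entries always win over the fallback.
        completed.setdefault(base, fill(base))
    return completed


full = define_bogus(GRAVE, string.ascii_lowercase, "keep_last")
print(full["a"], full["q"])  # à q
```

Once the table is complete in this sense, there is no undefined combination left to trigger the two-character behavior.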
Now in the end this piece of it is a parlor trick. I mean, even the ugly behavior isn't entirely strange if you ignore the case where you don't know it was a dead key. I mean, the two characters you typed are right there -- so maybe the old behavior isn't so bad!
So let's move into an area that is slightly more interesting, shall we?
Let's say you live in Finland and you are a huge advocate of the Finnish standard keyboard they created. The one that not only lets you type names from any EU language but also lets you type in other languages like Vietnamese (reportedly due to the immigrant population in country).
Now if you wanted to type a letter like
ậ
aka U+1ead (LATIN SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW)
Perhaps you would want to be able to type it like
CIRCUMFLEX + DOT BELOW + LATIN SMALL LETTER A
because you live in Finland and have been using dead keys for as long as you have been typing.
Well now that you have read Chain Chain Chain, Chain of Dead Keys, you know how to chain together the two dead keys to get to the one character.
But then we hit the problem -- you need there to be a CIRCUMFLEX + DOT BELOW character to stick in the table. And there isn't one in Unicode.
Perhaps you jumped to the idea of just using a PUA character. I mean, you convince yourself that you'll be adding the 150+ valid combinations and since you are defining all of those valid sequences no one will ever see the PUA characters standing in as pseudo characters for HORN + GRAVE and CIRCUMFLEX + TILDE and so on.
After defining the many dead keys, even solving the Getting intermediate forms problem of canonically reordered sequences (above and below diacritics entered in the wrong order) by automatically mapping them both to the right character, you then remember that any time you type an undefined sequence, the UTF-16 code points that you defined in the table get inserted.
And these pseudo characters you added as PRIVATE USE AREA characters might get inserted too.
I am definitely not a fan of putting random PUA into the world -- especially to define things that the user did not define themselves.
But didn't we just solve the problem of how combinations not defined in the dead key table are represented? Yes, we did!
For the cost of adding every single character in every shift state of the keyboard to the dead key table of every single dead key, you can create a humongous keyboard layout that guards against PUA leakage completely!
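A toy model may help show why the leak happens and why the big table stops it. This is only a sketch of the behavior described in these posts (an undefined combination emits the pending dead character plus the typed character), not how Windows implements it. The `type_keys` function, the table shape, the stand-in characters for the two dead keys, and the choice of U+E000 as the pseudo character are all assumptions of mine.

```python
def type_keys(layout_dead, tables, keys):
    """Simulate dead key handling for a sequence of typed characters.

    layout_dead -- set of characters that are dead keys on the layout
    tables      -- {dead_char: {typed_char: (result_char, result_is_dead)}}
    """
    out = []
    pending = None
    for ch in keys:
        if pending is None:
            if ch in layout_dead:
                pending = ch          # remember it; nothing is emitted yet
            else:
                out.append(ch)
            continue
        entry = tables.get(pending, {}).get(ch)
        if entry is None:
            # Undefined combination: both code points show up.
            out.extend([pending, ch])
            pending = None
        else:
            result, is_dead = entry
            if is_dead:
                pending = result      # chained: keep waiting
            else:
                out.append(result)
                pending = None
    return "".join(out)


CIRC, DOT = "^", "\u0323"   # stand-ins for the two dead keys (assumption)
PUA = "\ue000"              # pseudo character for CIRCUMFLEX + DOT BELOW
tables = {
    CIRC: {DOT: (PUA, True), "a": ("\u00e2", False)},
    PUA: {"a": ("\u1ead", False)},
}
dead = {CIRC, DOT}

good = type_keys(dead, tables, [CIRC, DOT, "a"])    # the intended letter
leaked = type_keys(dead, tables, [CIRC, DOT, "q"])  # PUA escapes into the text

# Guard: define the bogus combination so only the typed letter survives.
tables[PUA]["q"] = ("q", False)
guarded = type_keys(dead, tables, [CIRC, DOT, "q"])
```

In a real layout the guard row has to be repeated for every character in every shift state, which is where the humongous table comes from.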
In fact the only time the user has a hint something is going on is if they are watching WM_DEADCHAR messages. But since you define the name of the key (remember how I told you to always define it!), you can make sure that a really inquisitive mind trying to understand the WM_DEADCHAR results will get their explanation from GetKeyNameText.
Now of course this still doesn't resolve the Vietnamese/Finnish problem completely, given the Harder intermediate forms of characters that are still going to be out there, that require more than one code point since no entirely precomposed character exists. But thankfully these cases are very rare (and not supported by the bulk of the various Vietnamese code pages, either).
In any case, in the less extreme cases you can now use chained dead keys when you need to in order to get the result you want....
Thanks Encyclopedia Brown, for solving another mystery!
It seems I may have inspired myself a little bit.
Let me explain.
It started when I blogged Chain Chain Chain, Chain of Dead Keys almost a week ago.
No, hang on.
It actually started some time during the other ten blogs cited in that blog written over the last half decade or so.
When I started talking about dead keys.
The limitations about dead keys have been made clear.
Like the fact that they can only be used to produce a single UTF-16 code point at a time. And each piece of the dead key (the initial character/base character/composite character) contains exactly one UTF-16 code point.
Even when you chain dead keys together as I described how to do in that last post, you have to include valid UTF-16 code points during each link of the chain of dead keys.
There are other not entirely intuitive behaviors that dead keys bring to us -- like if you type keys that are not defined in the chain -- it will give up, and show both the character that is partially typed at the start and the character that you typed right after that (this is in fact what causes the problem in the third example I gave in The keyboard does not do what I tell it to!).
Anyway, after I wrote that blog on Saturday, I looked at all of this information and realized I might have found solutions to some of the common problems I have brought up in the past:
I tested my solution out, and I was correct -- I solved the two problems that no one else had ever solved before.
Of course I didn't come up with it five years ago, so I'm not going to sprain my shoulder patting myself on the back!
In the spirit of Encyclopedia Brown, can anyone describe the solution I found?
I'll provide the answer tomorrow, either giving credit to whoever finds what I figured out, or just to me. And I'll describe the solution, of course. :-)
Earlier this year, Kenji Crosland guest wrote a blog over on Brave New Words titled Subbed vs. Dubbed: Where do you stand?.
The article is a good read and goes through many of the issues related to the differences behind the scenes in translations of films via subtitles vs. dubbing the voices (subbed vs. dubbed):
One interesting factor about the piece is that Kenji included his own preferences on the matter:
If I had a choice, I’d probably choose subtitles over dubbing most of the time. In some movies, however, the subtitles can be so distracting from the action that you’ll spend more time reading than actually watching--especially when you don’t know one word of the original language (I had this experience with “Crouching Tiger, Hidden Dragon”).
My guess is that most people prefer subtitles to dubbed movies. I’m wondering, however, if the vote wouldn’t turn out differently if more time and money was spent improving the quality of dubbed films. So what do you think? Subbed or Dubbed?
The interesting part of that question in the context of the web is that if you think of each country/region as a separate market, clear trends pop up on an almost per-market basis as to which is most commonly preferred.
This begs the question a bit since it is just as likely that what someone sees the most of is what they prefer as saying that the general user preference drove the original "decisions" to prefer one over the other.
One would not only have a hard time discerning which answer is "right"; one would probably be forced to admit that both factors play a role. This makes the overall question "which do you prefer?" a little suspicious since the question is so hard to separate from where you live and what you see most often that not mentioning it leaves one unable to determine how carefully the issue was considered.
As a matter of course I always turn on closed captioning on my television, so even my English movies have the English words right there. And I notice everything -- the mistakes that maybe were seen years ago but never corrected, the times that the person doing the captions might have been basing the work on the script while the actor changed the words said, and the times that things are not included to avoid the "too much text" problem. Although the two technologies (closed captions and subtitles) work to do two very different things, they share some of the very same issues in terms of the compromises (without the choice of dubbing!).
And inevitably that thought leads me back to the subbing vs. dubbing question, and the fact that it is so seldom an issue that one gets to make a choice on personally, even if the question is asked that way. I'd like to see the results of a complex study into the differences per market in which is done and why. And although I wonder what people think, I know that a blog is probably not the way to attack the problem with any kind of scientific accuracy.
Which do I prefer between subbing and dubbing? Whichever one someone took the time to do for what is on the screen, of course. :-)
If you look at two different blogs of mine about fonts on Windows:
and you kind of combine their messages a bit -- one about the general tendency to really try to move to the new ClearType fonts for east Asian languages, and the other about all of the HKSCS-specific font work done for Hong Kong -- you might notice something.
They are in direct conflict with each other.
I have spent some time trying to understand how important this alternate form work for Hong Kong is. How crucial it is for these 4720 Han ideographs to be different in Hong Kong.
Of those I talked to, not even the people who recognized the differences thought it was all that important that, if the effort to move to the newer style fonts is successful in the long run, the HKSCS-specific work will be marginalized across the Windows user interface.
It felt to me like perhaps something was being missed, but apparently this is all okay.
Now note that these two different "models" for Traditional Chinese font choice have other important differences too. They both depend somewhat on the default system locale since they use the GDI font link chain that is modified by that locale setting. But the new model does get rid of the HK-specific behavior, which feels even a little more hacky when one really thinks about it.
Perhaps if I ran across someone who felt like the change would be suboptimal for Hong Kong I'd feel differently. But I guess it really wasn't that big of a deal for people in Hong Kong.
It does make one wonder if it was worth the time/money to create the HKSCS-specific fonts, though. Does the typical or even the exceptional user even notice? Will they notice if the difference largely goes away?
More questions than answers today....
Reactions to that video of the Unicode characters I talked about in 49571 ≈ 2π Sympathetic Characters, which I believe proves truth ≈ beauty were plentiful.
But regular reader Andrew West was not impressed. His reaction in a comment to that blog on the same day:
It would be cool if it did show every single graphic Unicode character, or even all characters in the BMP, but it is 5,000 characters short of what's currently in the BMP, and seems not to include anything much encoded during the last ten years.
Then, a few days later, he decided to try and do a better job, in his own blog titled Unicode 6.0 — One character at a time. From his intro:
A recent youtube video by jörg piringer that scrolls through "all" 49,571 Unicode characters in 33 minutes and 16 seconds (25 characters a second) has been doing the rounds, but I'm afraid that I was not impressed. The 49,571 characters in the video only cover the BMP, and even then it is 5,000 characters short, missing out most of the characters that have been added to Unicode over the past ten years, and missing out entirely some scripts that have been in Unicode since Year Zero.
Unicode version 6.0 (released October 2010) actually defines 109,384 characters (109,244 graphic and 140 format characters). How many of them you are able to see depends upon your operating system, your browser and whether you have additional fonts installed covering obscure and recently encoded scripts and characters (and whether your browser will actually apply those fonts or not). On my Windows 7 SP1 machine, with no additional fonts installed, I can see 95,372 of these 109,384 characters (87.1% coverage of total number of characters, but only fully covering 66 out of 203 blocks, and 85 blocks with no coverage at all).
Now that says it all for me, on several levels.
And then his "video" is pretty amazing too. Check out his blog, and also the JavaScript page. You'll see what I mean.
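For anyone who wants to reproduce the kind of totals Andrew quotes, here is a rough sketch using Python's `unicodedata` module. The split into graphic (L, M, N, P, S, Zs) and format (Cf, Zl, Zp) general categories is my assumption about how such totals are usually tallied, and the numbers depend on which Unicode version the Python runtime ships with, so they will generally be higher than the Unicode 6.0 figures.

```python
import sys
import unicodedata

GRAPHIC_MAJOR = ("L", "M", "N", "P", "S")   # letters, marks, numbers, punctuation, symbols
FORMAT_CATS = ("Cf", "Zl", "Zp")            # format characters plus line/paragraph separators


def count_characters():
    """Count graphic and format characters known to this Python's Unicode data."""
    graphic = 0
    fmt = 0
    for cp in range(sys.maxunicode + 1):
        cat = unicodedata.category(chr(cp))
        if cat[0] in GRAPHIC_MAJOR or cat == "Zs":
            graphic += 1
        elif cat in FORMAT_CATS:
            fmt += 1
    return graphic, fmt


graphic_count, format_count = count_characters()
print("Unicode", unicodedata.unidata_version)
print("graphic:", graphic_count, "format:", format_count)
```

This only says what is encoded, of course. How many of those characters you can actually see still depends on the fonts installed, which was Andrew's point.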
Lessons learned:
Thanks, Andrew! :-)
This is a blog about the Naira. Cue gratuitous Naira graphic here:
From time to time I have talked about currency symbols and times that they were not fully covered by Windows -- for both good reasons (e.g. they did not exist when Windows was released) and bad ones (e.g. no one mentioned it).
From the recent change to add the new Rupee in India (Rupee! Rupee! Let down your CHAR!) to the unused Tenge in Kazakhstan (It is with a Tenge of sorrow that I say this) to the unused Guarani in Paraguay (The elusive G- sign said to exist in South America may not be in Windows, says a customer who has hunted for it), there are always currency signs that we end up a little behind on.
Well, I guess we can say that there is perhaps another one.
Where?
Well, in Nigeria, there is the Nigerian Naira.
It has a currency sign in Unicode that has been around at least since Unicode 1.1.
DerivedAge.txt in the Unicode Character Database doesn't go back to 1.0 and I'm too lazy to look it up in the book, so we'll call it 1.1.
Okay, hang on, that sounds pathetic. I'll check, I have the Unicode 1.0 books on the shelf in the other room.
Aha, I'm back! And this currency sign is in Unicode Name Index of Unicode 1.0, on page 666 of volume 1 of the standard.
Don't even get me started on the 666 thing, or the fact that I'm apparently not too lazy to go dig up an old book reference but I am too lazy to fix up the text to account for the decision to go look it up after all, and being initially wrong in my guess!
Anyway, since Unicode 1.0 it has been ₦ (U+20a6, aka NAIRA SIGN). This guy:
And an ISO 4217 International currency designator of NGN.
That currency sign is in all of our fonts, so everything looks good, right?
Well, not entirely.
You see the three locales in Windows for Nigeria have a LOCALE_SINTLSYMBOL (the ISO 4217 value) of NIO-- which is actually the designator for the Nicaraguan Cordoba oro.
And the LOCALE_SCURRENCY?
Well, just read 'em and weep:
Yes, that's what it looks like. They all have LATIN CAPITAL LETTER N there.
Um.
Yes.
No.
Really.
It's like the Tenge thing or the Guarani thing all over again!
We should probably fix those -- all three of those locales, for both LOCALE_SINTLSYMBOL and LOCALE_SCURRENCY.
I suppose one could look on the bright side: at least the LOCALE_SNATIVECURRNAME and LOCALE_SENGCURRNAME are correct for all three (yo-NG, ig-NG, ha-Latn-NG). It's a start....
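As a small illustration of the check that would have caught this, here is a sketch that compares reported locale values against the expected ones. The reported values are hard-coded from what is described above rather than queried from Windows; on a real machine they would come from GetLocaleInfoEx. The dictionary layout and the function name are my own.

```python
# Expected values for Nigeria: the ISO 4217 code and NAIRA SIGN (U+20A6).
EXPECTED = {"LOCALE_SINTLSYMBOL": "NGN", "LOCALE_SCURRENCY": "\u20a6"}

# Values as described in this post, hard-coded for illustration (not queried).
REPORTED = {
    locale: {"LOCALE_SINTLSYMBOL": "NIO", "LOCALE_SCURRENCY": "N"}
    for locale in ("yo-NG", "ig-NG", "ha-Latn-NG")
}


def find_mismatches(reported, expected):
    """Return (locale, field, got, want) for every value that is off."""
    problems = []
    for locale in sorted(reported):
        for field, want in expected.items():
            got = reported[locale].get(field)
            if got != want:
                problems.append((locale, field, got, want))
    return problems


problems = find_mismatches(REPORTED, EXPECTED)
for locale, field, got, want in problems:
    print("%s %s: got %r, expected %r" % (locale, field, got, want))
```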
Over in the Suggestion Box, Van Anderson asked:
I've always wanted to know how to chain dead keys. Since you can't do it in MSKLC, this seems like something that should be out there for the public at large. Is it as simple as altering the KLC file by adding an '@' after a composite in a dead key list, then giving it its own dead key list? Do you have to do a command line compile with special flags? Basically, I'm looking for documentation on how to go about doing it for myself, seeing as you were unceremoniously taken off of MSKLC development before you could do everything you wanted.
Now I have mentioned chained dead keys many times in the past:
But I have never truly described how it is done, though I have left broad hints about it -- hints that Van picked up on and clearly was hoping I would perhaps "finish the job". He even supplied some potential motive for me to do so, the reference to how the MSKLC project was transitioned and buried....
Well, I don't require that additional motivation. :-)
But I figure it might be nice to finally put it all out there and explain how it is done.
And so, without further ado, I will explain how to chain dead keys on Windows.
First the gratuitous videos for the Chain of Fools song:
Okay, with that crucial bit out of the way, I'll continue. :-)
Now first I'll review some of the consequences of what I am about to tell you:
Okay, the simple steps, now:
Step 1: Create a simple keyboard like below in MSKLC:
Be sure to put both the upper case and lower case letters on those two keys with letters on them, and do the key that is a dead key just as given above.
Step 2: Build the keyboard from Project | Build DLL and Setup Package.
Step 3: Save out the .KLC file and open it up. If you look at the file, the LAYOUT and DEAD KEY sections should look as follows:
LAYOUT ;an extra '@' at the end is a dead key

//SC VK_ Cap 0 1 2
//-- ---- ---- ---- ---- ----

16 U 1 u U -1 // LATIN SMALL LETTER U, LATIN CAPITAL LETTER U, <none>
18 O 1 o O -1 // LATIN SMALL LETTER O, LATIN CAPITAL LETTER O, <none>
28 OEM_7 0 00b4@ -1 -1 // ACUTE ACCENT, <none>, <none>
39 SPACE 0 0020 0020 -1 // SPACE, SPACE, <none>
53 DECIMAL 0 002e 002e -1 // FULL STOP, FULL STOP,

DEADKEY 00b4

00b4 02ba // ´ -> ʺ
0020 00b4 // -> ´
As the file says, the @ on the end of 00b4 indicates it is a dead key -- and thus the later dead key table for 00b4 that is being pointed to.
Replace the DEADKEY table there with the following:
DEADKEY 00b4 ; combinations with Acute

00b4 02ba@ ; Acute + Acute -> Double Acute (Modifier Letter Double Prime)
0020 00b4 ; Acute + Space -> Acute

DEADKEY 02ba ; combinations with Double Acute (Modifier Letter Double Prime)

o 0151 ; Double Acute + o -> Small Letter o With Double Acute
O 0150 ; Double Acute + O -> Capital Letter O With Double Acute
u 0171 ; Double Acute + u -> Small Letter u With Double Acute
U 0170 ; Double Acute + U -> Capital Letter U With Double Acute
0020 2033 ; Double Acute + space -> Double Prime
Save the file.
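Since MSKLC can no longer check the file for you once it has chained dead keys in it, a small sanity check of your own may be worth having. The following is a sketch, assuming the DEADKEY sections look like the ones in this step: it verifies that every composite marked with '@' has its own DEADKEY table. The parser is deliberately simple and the function names are mine. It is not a full .KLC parser.

```python
# Section keywords that end a DEADKEY table (a partial list, as an assumption).
OTHER_SECTIONS = {"LAYOUT", "KEYNAME", "KEYNAME_EXT", "KEYNAME_DEAD",
                  "DESCRIPTIONS", "LANGUAGENAMES", "ENDKBD"}


def parse_deadkey_tables(klc_text):
    """Return {dead: [(base, composite, chains)]} for each DEADKEY section."""
    tables = {}
    current = None
    for raw in klc_text.splitlines():
        # Drop both comment styles seen in the file.
        line = raw.split(";")[0].split("//")[0].strip()
        if not line:
            continue
        fields = line.split()
        if fields[0] == "DEADKEY" and len(fields) >= 2:
            current = fields[1].lower()
            tables[current] = []
        elif fields[0] in OTHER_SECTIONS:
            current = None
        elif current is not None and len(fields) >= 2:
            composite = fields[1]
            chains = composite.endswith("@")   # '@' marks a chained dead key
            tables[current].append((fields[0], composite.rstrip("@").lower(), chains))
    return tables


def missing_chain_tables(tables):
    """List chained composites that have no DEADKEY table of their own."""
    return sorted({comp for rows in tables.values()
                   for _, comp, chains in rows
                   if chains and comp not in tables})


SAMPLE = """\
DEADKEY 00b4 ; combinations with Acute
00b4 02ba@ ; Acute + Acute -> Double Acute
0020 00b4 ; Acute + Space -> Acute
DEADKEY 02ba ; combinations with Double Acute
o 0151
O 0150
u 0171
U 0170
0020 2033
"""

print(missing_chain_tables(parse_deadkey_tables(SAMPLE)))
```

An empty list means every link in the chain has somewhere to go.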
Step 4: Build the four DLL files, replacing the files already built.
To do this on my 64-bit machine, I ran the following command lines manually from the directory that contained the modified .KLC file and then hand-copied each DLL after it was built to the relevant directory of the setup built in Step 2 (you should substitute the correct path to MSKLC for your install):
C:\Users\michkap\Desktop>"\Program Files (x86)\Microsoft Keyboard Layout Creator 1.4\bin\i386\kbdutool.exe" -u -v -w -x layout05.klc
C:\Users\michkap\Desktop>"\Program Files (x86)\Microsoft Keyboard Layout Creator 1.4\bin\i386\kbdutool.exe" -u -v -w -m layout05.klc
C:\Users\michkap\Desktop>"\Program Files (x86)\Microsoft Keyboard Layout Creator 1.4\bin\i386\kbdutool.exe" -u -v -w -i layout05.klc
C:\Users\michkap\Desktop>"\Program Files (x86)\Microsoft Keyboard Layout Creator 1.4\bin\i386\kbdutool.exe" -u -v -w -o layout05.klc
Of course if you are automating things you can do the copy after each step.
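If you do want to automate it, the build-then-copy loop could be planned along these lines. This is a sketch only: it builds the four command lines and the copy destinations but does not run anything. The mapping of each flag to a setup subdirectory (`i386`, `amd64`, `ia64`, `wow64`) and the assumption that the DLL is named after the .KLC file are mine, so check them against the package that Step 2 produced.

```python
import ntpath  # Windows path rules, usable on any platform

KBDUTOOL = (r"C:\Program Files (x86)\Microsoft Keyboard Layout Creator 1.4"
            r"\bin\i386\kbdutool.exe")

# ASSUMPTION: which setup subdirectory each platform flag belongs to.
TARGETS = [("-x", "i386"), ("-m", "amd64"), ("-i", "ia64"), ("-o", "wow64")]


def build_plan(klc_file, setup_dir, tool=KBDUTOOL):
    """Return a list of (command, copy_destination) pairs, one per platform."""
    stem = ntpath.splitext(ntpath.basename(klc_file))[0]
    plan = []
    for flag, subdir in TARGETS:
        command = [tool, "-u", "-v", "-w", flag, klc_file]
        # ASSUMPTION: the built DLL carries the same base name as the .KLC file.
        destination = ntpath.join(setup_dir, subdir, stem + ".dll")
        plan.append((command, destination))
    return plan


plan = build_plan("layout05.klc", r"C:\setup")
for command, destination in plan:
    print(" ".join(command), "->", destination)
```

On Windows, each command could be passed to `subprocess.run` and the resulting DLL copied with `shutil.copy` before the next build overwrites it.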
Just remember that the file can't be opened in MSKLC any more!
Step 5: Install the setup package on a machine.
You will now be able to type
´´u´´U´´o´´O
to get
űŰőŐ
and if you ever type anything other than u, o, U, or O for that third character, you will see the ″ (U+2033, aka DOUBLE PRIME) put in for those two acutes.
One last note, you should probably add names for all the dead keys too, they are in a table near the end of the .KLC file:
KEYNAME_DEAD

00b4 "ACUTE ACCENT"
You should add one line for each dead key, even including the ones that can never be produced (since they are still returned by WM_DEADCHAR type messages). Things will still work if you don't do this, but it is a bad practice. For the sake of completeness, I would highly recommend this last step.
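With a lot of dead keys this gets tedious, so here is a minimal sketch that generates the rows from the Unicode character names. The function name and the tab separator are my own choices, and the fallback text for characters without a Unicode name (PUA characters, for example) is just a placeholder you would want to replace with something descriptive.

```python
import unicodedata


def keyname_dead_section(dead_chars):
    """Build KEYNAME_DEAD rows, one per dead key character."""
    lines = ["KEYNAME_DEAD", ""]
    for ch in dead_chars:
        # Characters with no Unicode name get a placeholder instead of an error.
        name = unicodedata.name(ch, "UNNAMED DEAD KEY")
        lines.append('%04x\t"%s"' % (ord(ch), name))
    return lines


section = keyname_dead_section("\u00b4\u02ba")
print("\n".join(section))
```

For the keyboard built above this produces one row for the acute and one for the chained double acute.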
Now, there are some interesting things to note here, as you plan out how you might choose characters to use in your own chained dead key containing keyboard:
And there you have it: how to chain dead keys in Windows. Hope that helps you out, Van. And everyone else! :-)
In the world of spam and malicious code (I am treating it as one world for the current purposes though it is really two separate, somewhat overlapping worlds), whether it is the code in spam/malicious e-mails or the code in spam/malicious web sites, the bulk of the initial efforts were not very supportive, internationally.
Now in my heart of hearts I like to see code done correctly.
But I am hardly going to go into the spam business.
Insert gratuitous Monty Python Spam video here.
I'm sure they might pay well, mainly because I think I was almost offered such a job once. It's just that while the person was feeling out my interest my disdain for the efforts was not as veiled as it might have been.
Wes Miller mentioned the other day in his Spamsplosion blog:
So the irony here was that I had to actually switch to using Unicode everywhere because spammers (in addition to doing some pretty conniving stuff to get into your inbox and get you to click on things), are actually sending messages in Unicode, most likely not to enable localized text, but to evade spam-sniffing tools that (ahem, like mine) blow up or skim over when Unicode text comes along.
That is probably part of it, but there is more to it than that.
Lots more tools support Unicode now than did before, and UTF-8 is so often the default in these tools now.
And I mean there are many spam sniffing tools that look for code pages being used that the reader would not expect, as well.
Plus there is spam starting to take advantage of lookalike characters, something much easier in Unicode since there are more of them (and no matter how smart Internet Explorer gets in this area for IDN, email programs are not all doing as great a job on the random text in email front -- so filters looking for the names of erectile dysfunction drugs may or may not know to look for the math alphanumerics and so forth).
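To show what "knowing to look for the math alphanumerics" might mean in practice, here is a small sketch that folds text with NFKC compatibility normalization before matching keywords. This is one possible approach rather than what any particular filter does, and the function names are mine. NFKC handles compatibility characters such as the mathematical alphanumerics, but it does not touch cross-script lookalikes such as Cyrillic letters, which need a separate confusables mapping.

```python
import unicodedata


def fold_lookalikes(text):
    """Fold compatibility variants (math alphanumerics, fullwidth, ...) to plain letters."""
    return unicodedata.normalize("NFKC", text).casefold()


def matched_keywords(text, keywords):
    """Return the keywords found in the text after folding."""
    folded = fold_lookalikes(text)
    return [word for word in keywords if word in folded]


# "spam" spelled with MATHEMATICAL BOLD SMALL letters (U+1D42C, U+1D429, U+1D41A, U+1D426).
disguised = "\U0001d42c\U0001d429\U0001d41a\U0001d426"
print(matched_keywords("cheap " + disguised + " here", ["spam"]))

# A Cyrillic U+0430 in place of Latin 'a' survives NFKC untouched.
print(matched_keywords("sp\u0430m", ["spam"]))
```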
There is a part of me that really wishes for more appropriate uses of ISO-10646 than support of spam. Maybe I should form PE-UU (People for the Ethical Use of Unicode). Or maybe I should work on the name some.
Sinnathurai Srivas is at it again.
You may remember when I talked about another recent episode, in From Tamil puns to Gallagher clips, in ~10 years.
This time he is still going on about Tamil grammar and its scientific nature. And how UC (the Unicode Consortium) was destroying it.
But something different this time....
Allow me to quote from the thread on Unicode's Indic list:
Tamil grammar in one of its rules categorically states that if elongated vowel is required add the appropriate vowel/s.
ie for example if kiii is required in Tamil k+ii+i > ௧ீி
when it comes to using dependent vowels (name is questionable) it should not be restricted by dotted line.
For example a Tamil name, Eurotamilini is in use (similar to Indiraani, lankeshwaran, Eelavar) and it can be written as it should be, as Grammar permits it.
Denying this facility or suggesting complicated ways to represent it are wrong and it should be reversed.
(Not I also noticed in Viki Hindi page, Hindi also using longer than long vowel i.)
So it looks like forcing dotted circle is wrong, wrong according to Grammar and wrong compared to, for example English etc..
Can we deprecate the dotted circle?
Auai is a name. why attempting to block it.
varuuum is a typical Tamil word, why attempt to block it.
Sinnathurai
Peter Constable, colleague and brave man, jumped into the breach with a dose of reasonable inquiry. He knows what is going on but he knows that whether Srivas is right or wrong he doesn't ever tend to raise issues in an actionable way. And so if there is something genuine the answer must be teased out:
You don’t say what products are blocking k+ii+i.
But how would you expect k+ii+i to display? The two vowels overlap and become somewhat illegible:
Re “Auai” and “varuuum”, what are the Unicode character sequences you suggest, and how do you think they should be rendered?
Peter
These are not idle questions Peter raises here. One can praise the rendering of a generative model for shoving random diacritics in Latin all you want, but if What do you get when you combine a base character with a buttload of diacritics? proves anything, it proves that text based on randomly stringing stuff together is certain to look worse than text based on known expectations that engineers plan for.
This is kind of a strength that the contrasting Indic model provides -- it handles what it knows how to do, and it assumes the rest is invalid. If all your text is valid and the typography folks know the full set of valid combinations then you never have problems.
Now the contention that Srivas has here is that there is a missing case, which is (if true) useful information.
But he provides no images, no citations. If you want to see how bad text looks if you just shove random crap into the frame, see that buttload of diacritics blog. The information Peter was asking for is not optional.
Srivas replied back to Peter's responding mail:
Hi Peter,
Some details before answering to your question.
When printing technology was introduced in India (by western missionaries, rather like Unicode now), some changes were made to the way the extra long vowels were represented.
the extra long vowels were/are being written normally in the form
consonant+long_dependent vowel + short-independent vowel
When pronouncing many make the mistake of splitting/breaking the extra long vowel. (I have proof even within educational CDs/DVDs on this.)
Actually these are long vowels are supposed to be elongated in time not broken as separate sounds.
There are occasions in which breaking as components also required. (This I found as example in representing foreign names. for example,
The name: Kori Akiinoo is normally pronounced Kori AkIInoo, where the IIno supposed to start a new word, but because it is a name, is written without space.
Anyway, in normal Tamil it is extra long vowel, bout components of different sounds.
We are using the correct form in special circumstances and continuing with the incorrect method for normal use.
So to start, I have ways of correctly representing the correct requirements using font technology. In basic form the glyphs do not need to overlap. these are only rendering manipulations.
However the dotted circles create problems as not all combinations require complex rendering, especially the extra long representation.
FYI:
The Tamil grammar rule that explains how to write the extra long mathrai/time (not mathra) is as follows,
"If elongation is required, write by stacking/adding the appropriate vowel mathrai".
The product:
I;ve just tried the following in MS Word 2010 and it shows a dotted circle before the third mathrai. I pates it below in yahoo mail and it also shows dotted circle. To be sure, please use a font that actually has the dotted circly glyph.
வரூும்
Regards
Sinnathurai
Now notice that he did not respond with any of the information that is needed here. At least he might have noticed that (he was providing information "before answering" -- he was not answering).
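For what it is worth, the character-sequence half of what Peter asked for takes very little effort to produce. Here is a sketch that dumps the code points of a string and flags the places where one Tamil vowel sign directly follows another, which is the situation where a renderer may insert a dotted circle. The sample is my own transcription of the word pasted in the mail above, written with escapes so the code points are unambiguous. The function names are mine.

```python
import unicodedata


def describe(text):
    """List each character as 'U+XXXX NAME' so a sequence can be quoted exactly."""
    return ["U+%04X %s" % (ord(ch), unicodedata.name(ch, "<unnamed>")) for ch in text]


def stacked_vowel_signs(text):
    """Return indexes where a Tamil vowel sign directly follows another one."""
    def is_sign(ch):
        return unicodedata.name(ch, "").startswith("TAMIL VOWEL SIGN")
    return [i for i in range(1, len(text)) if is_sign(text[i - 1]) and is_sign(text[i])]


# Transcription (assumption) of the pasted word: VA, RA, sign UU, sign U, MA, virama.
SAMPLE = "\u0bb5\u0bb0\u0bc2\u0bc1\u0bae\u0bcd"

for row in describe(SAMPLE):
    print(row)
print("stacked vowel signs at:", stacked_vowel_signs(SAMPLE))
```

A dump like this, plus a screenshot and a citation showing the expected rendering, is the sort of thing that makes a report actionable.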
At this point Shriramana Sharma, a colleague of mine from INFITT's WG02 replied:
Srivas, there is neither need nor justification to deprecate the dotted circle. If implementations display a dotted circle where they should not, it is a problem with those implementations and not with the encoding. Peter has already informed me that the major software producer he is part of has already taken efforts to ensure that all conceivably meaningful sequences are meaningfully displayed. If there is any lacuna in this aspect, please provide appropriate screenshots and I'm sure Peter will be glad to look into it.
This highlights another issue, one perhaps implied to people who know the situation, about whose "fault" the issue is.
You see, Unicode isn't the one defining the use of the dotted circle for every errant case. The fact that Srivas keeps blaming Unicode unscienticality is something that everyone grows tired of, so I'm sure someone else would have lost patience soon if Shriramana Sharma had not. But perhaps they would not have been as eloquent so let's call this one a win.
It is an implementation-specific problem and if the right info is provided then the issue can be addressed.
The accuracy was lost on Srivas though, who replied:
Sharma,
The need is to write Tamil as in Grammar.
The grammar states, to obtain elongated matrai, not short, not long, but extra long, stack as required.
The combining vowel is for indicating it is combining/elongating matrai.
The independent vowel indicates it does not kind of combine, but stands alone.
The users hence, wrongly assume it as standalone.
Further, I'm not sure if UC misunderstood the definition of Matrai.
or
Sanskrit mislearnt about the matrai.
Matrai does not directly mean combining vowels, but means timing. (both for consonant and vowels.)
As far as Tamil is concerned, UC has a mis understanding about Mathrai and trying to apply this may be the route cause of the error in decisions.
I know you are keen on Sanskrit.
Could you clarify the definition for timing/matrai and combining vowels in Sanskrit.
I can then querry UC, the definitions of the same.
The Tamil Grammar not only is correct, but also accurate in having the timing definitions.
I want to know wether the problem created by UC or SK.
Sinnathurai
Huh? I don't think he understood the question at all!
At this point, Vinodh Rajan jumped in to reply:
Well Srivas,
In Contemporary Tamil (you know the type I learned in School ) requires that for Extra-long vowels, the corresponding short independent vowel sign be added for each Mora.
கூஉ கோஒ காஅ மீஇ
kūu kōo kāa mīi
This is how things work in the mainstream Tamil. I was taught only this way in School, and thats how Tamil books publish are being published around the globe.
Please don't push your fringe interpretations of Tamil Grammar into Unicode and expect the consortium to change the rules for your own whims and fancies.
Tamil Grammar does not mandate the use of Dependant vowel signs for extra-long vowels. Period.
Before attempting to educate UC about Tamil Grammar, I suppose you must read them properly in first hand.
V
Now what is being underscored is the underlying truth about our man Srivas -- that he is inaccurate and perhaps in some cases untruthful much more often than not. The principle underlying the earlier responses was the assumption that there might be a valid issue that Srivas would help bring to light.
Even though everyone knew that if something actionable came out of the issue it would not be from Srivas, it would be from someone else who had the graphic, and the citations.
INFITT WG02 has done such changes in the past -- for example proving the need to support misnamed Aytham as a standalone character rather than as a combining one, since it is used in loan words as a de facto TAMIL LETTER FA. That was another case of a bogus dotted circle, though that one was a side effect of Unicode character properties and how Microsoft used them. Thus when the property value changed and Microsoft picked the change up, the dotted circle went away.
ACTIONABLE feedback, to the right people if possible. I mean, Srivas complaining on a Unicode list about a potential Microsoft issue isn't quite the level of complaining to McDonald's about the Burger King whopper you just had, but it is enough of a disconnect that one should endeavor to improve the aim when one learns of the new target.
By and large, it is the best way to assure getting a bug fixed....