Tuesday, November 01, 2005 7:15 AM
Michael S. Kaplan
I WON to talk about the YEN
(terrible pun in the title, sorry!)
It was early last month that I shook my head about the REVERSE SOLIDUS and stated for all the world to hear that I'd rather call it the path separator.
And I have been a strong advocate for keeping everything in Unicode, pointing out that code page 932 and code page 949, while being very useful for many purposes, are like poison for the YEN SIGN (U+00a5) and the WON SIGN (U+20a9), since any time they roundtrip through one of those two code pages they are converted to U+005c.
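To make the round trip concrete, here is a minimal sketch that models it. It does not call the real Windows conversion functions; the `CURRENCY_SIGN` table and the `to_code_page`, `from_code_page` and `round_trip` helpers are hypothetical names made up for illustration, and the model only handles ASCII text plus the one currency sign for each code page.

```python
# Hypothetical model of the round trip described above.
# It does NOT call into Windows; it only mimics the one mapping of interest.
CURRENCY_SIGN = {
    932: "\u00a5",  # YEN SIGN, Japanese code page
    949: "\u20a9",  # WON SIGN, Korean code page
}


def to_code_page(text: str, code_page: int) -> bytes:
    """Unicode -> code page: the currency sign lands on byte 0x5C."""
    sign = CURRENCY_SIGN[code_page]
    # Assumption: everything else in the text is plain ASCII.
    return text.replace(sign, "\\").encode("ascii")


def from_code_page(data: bytes, code_page: int) -> str:
    """Code page -> Unicode: byte 0x5C always comes back as U+005C."""
    return data.decode("ascii")


def round_trip(text: str, code_page: int) -> str:
    return from_code_page(to_code_page(text, code_page), code_page)


if __name__ == "__main__":
    before = "\u00a5100"
    after = round_trip(before, 932)
    # Print the code points so the change is visible on any console font.
    print([f"U+{ord(ch):04X}" for ch in before])
    print([f"U+{ord(ch):04X}" for ch in after])
```

The point of the sketch is that the loss happens on the way out: once the sign has become byte 0x5C, nothing on the way back can tell it apart from a path separator.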
But you may have noticed that I palmed a card there when I told people to keep it in Unicode to solve the problem, since the statement ignores the fact that neither U+00a5 nor U+20a9 is on the keyboards for Japanese or Korean users.
So while it is true that keeping it in Unicode will solve the problem, that only works if you use the real YEN and WON in the first place. If the key you press gives you U+005c, the problem is already there before any conversion happens....
(Ben Bryant has been pouncing on me lately about this issue in the newsgroups; he describes it in his post entitled Phantom Currency Signs in Japan and Korea.)
If you include that aspect then this is not a trivial problem to solve, either, especially since it will continue unless and until the keyboard issue is solved.
Now we have some of the tools in place to help here:
- As I pointed out here and here, both Japanese and Korean collations on Windows will equate the REVERSE SOLIDUS with the actual currency sign, thus assuring us that people using the characters as currency signs will not be punished for using the actual currency signs.
- The appearance of the REVERSE SOLIDUS is the same as the currency sign on Japanese and Korean systems.
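The first of those tools can be pictured with a small sketch. This is only an illustration of the idea of treating the two characters as equal for comparison; it is not how the Windows collation tables are actually built, and the `CURRENCY_FOLD` table, the `"ja"`/`"ko"` keys, and both helper functions are hypothetical.

```python
# Hypothetical per-locale folding table; an illustration only,
# not the actual Windows collation data.
CURRENCY_FOLD = {
    "ja": {0x00A5: 0x005C},  # YEN SIGN compares like REVERSE SOLIDUS
    "ko": {0x20A9: 0x005C},  # WON SIGN compares like REVERSE SOLIDUS
}


def currency_key(text: str, locale: str) -> str:
    """Build a comparison key with the locale's currency sign folded."""
    return text.translate(CURRENCY_FOLD.get(locale, {}))


def same_under_locale(a: str, b: str, locale: str) -> bool:
    """True if the two strings differ at most by sign-versus-backslash."""
    return currency_key(a, locale) == currency_key(b, locale)
```

Note that the folding is per locale in this sketch: the YEN SIGN is only folded for Japanese and the WON SIGN only for Korean, so a string with a real YEN SIGN would still compare as different from its backslash twin under any other locale.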
Unfortunately, the PATH SEPARATOR is still a more crucial character on Windows than the currency sign -- imagine an OS with no paths! Would Windows even install? (probably not!).
Ben is right that making the YEN and the WON also act as path separators would have fixed the problem, but that suggestion is over a decade too late now. Still, it is a tempting, bordering on tantalizing, idea, since the number of true YEN and WON characters in the wild has to be small given that they are on no keyboards....
Until reality sets in and I remember that the NLS locale data properly assigns the true YEN and WON to the LOCALE_SCURRENCY fields for the Japanese and Korean locales, which means anyone who has ever used a program that calls GetCurrencyFormat might have them. And our group cannot punish users who use Unicode and NLS API functions properly without seeming like monsters!
Even if we added the characters to the keyboards tomorrow, how do you document to users that these two characters that look the same should be used in these two different circumstances?
Which may be why, as I pointed out in When is a backslash not a backslash?, users in Japan and Korea are often fatalistic and resigned about it.
Perhaps one of the reasons that there is no big push to solve the problem is that Japanese and Korean are not widely used as languages outside of their respective countries, so there is not enough of a market segment that would benefit from adding the extra keys to the keyboard. At least not enough to outweigh the confusion.
But now I am thinking again, perhaps that segmentation of the market will provide a chance to solve the problem, too. Let me think on this one a bit more, I may have an idea....
This post brought to you by "¥" and "₩" (U+00a5 and U+20a9, a.k.a. YEN SIGN and WON SIGN)