Sorting it all Out: Michael Kaplan's random stuff of dubious value
One of the more fascinating conversations I had with a customer at the recent Internationalization and Unicode conference was with someone who frankly admitted to being there on his own dime, trying to get enough information (well, he called it ammunition, actually) to help him in his battle to get his company to move to Unicode in its products.
He sensed an underlying frustration I have in some of my blog posts relating to non-Unicode apps, one that he shared.
Our conversation was after my Windows Vista Language Support — How Does it All Fit Together? talk, so it was certainly past the halfway point in the conference. After nearly two decades of Law & Order I knew you are never supposed to ask a question you don't know the answer to, but I couldn't resist. I asked him if he felt armed now?
He smiled and said yes. But it wasn't just the sessions. That stuff was fine, but the thing he thinks will be the best weaponry in this war he is fighting (with a management team that simply refuses to recognize the importance of Unicode in their long-term strategy for entering new markets) was actually the other people he met there.
The ones who are having the same problems, the ones who have fought the same battles and won, even the ones who have fought those battles and lost. He knew that many people were struggling in this space, and he felt like he learned a lot of things about how best to bring on the right battles, which ones to avoid, and how to plan for winning.
On the last day he saw me just before I did my last presentation (Sorting It All Out: Even More Words on Collation) and told me he was "armed and ready." I told him "lock and load, dude." :-)
The night before, I had dinner with a bunch of folks who were at the conference. One of them had been in Martin Dürst's Internationalization of Ruby Scripting Language talk, and he mentioned how Martin had briefly talked about the U+005c issue, you know, the one I have talked about quite a bit. His suggested offhand solution (which this person thought made some sense) was to start making the Yen/Won glyphs that fonts have parked at U+005c less attractive -- like slanting them or whatever.
If it sounds to you like an attempt to sour the milk in order to wean the baby then you aren't too far off of the idea here.
I was horrified.
I mean, when you think about how (as I mentioned in When is a backslash not a backslash?) the real Yen and Won (U+00a5 and U+20a9) were just best fit mappings to 0x5c in their respective legacy code pages, such a move would basically make all non-Unicode applications look bad.
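For anyone who wants to see the shape of the problem, here is a minimal sketch in Python. The `BEST_FIT` table and the `best_fit_encode` and `describe` helpers are hypothetical names made up for illustration, and the table holds only the two mappings described in this post; it is not the actual Windows conversion table or API. The point is simply that once a real Yen or Won has been best fit to byte 0x5c, decoding that byte gives back U+005c, and only the font decides what the user sees.

```python
import unicodedata

# Hypothetical best-fit table for illustration only. It contains just the
# two mappings described in the post, not the real Windows tables.
BEST_FIT = {
    "cp932": {"\u00a5": b"\x5c"},  # YEN SIGN -> 0x5c in the Japanese code page
    "cp949": {"\u20a9": b"\x5c"},  # WON SIGN -> 0x5c in the Korean code page
}


def best_fit_encode(text, codepage):
    """Encode text, applying the illustrative best-fit table first."""
    table = BEST_FIT.get(codepage, {})
    out = bytearray()
    for ch in text:
        if ch in table:
            out += table[ch]
        else:
            out += ch.encode(codepage)
    return bytes(out)


def describe(data, codepage):
    """Decode bytes and list each resulting code point with its Unicode name."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in data.decode(codepage)]


if __name__ == "__main__":
    for codepage, currency in (("cp932", "\u00a5"), ("cp949", "\u20a9")):
        data = best_fit_encode(currency, codepage)
        # The currency sign went in, but a REVERSE SOLIDUS comes back out.
        print(codepage, data, describe(data, codepage))
```

The round trip is lossy by design: the legacy application only ever stored 0x5c, so whether the user sees a currency sign or a backslash depends entirely on the glyph the font draws at U+005c.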
Now I know sometimes -- e.g. in Double Secret ANSI, part 2 (the brokenest one yet, sorry 'bout that!) -- it may happen by accident. And then it is a bug to fix.
But the suggestion here was to actually have us engineer it intentionally.
And it got worse.
When I pointed out what this would mean for non-Unicode applications, I was glad to be sitting. Or else I would have fallen over.
He told me, "Almost all applications are Unicode now, who cares about ANSI applications?"
I mean no matter how much of Office is now Unicode or how much I talk about how the Unicode train is leaving or has left the station, it does not mean we have brought all the customers with us yet. Many of them are still behind waiting -- some waiting for Microsoft to send a better train for them to be in, some waiting for more proof that the destination is worthwhile enough for them to even want to travel.
But the world is by no means mostly Unicode yet.
After that I felt quite fired up and invigorated. We have to start helping to make that move easier and explaining the reasons better.
Time to get people to Unicode. It's time to lock and load, baby!
This post brought to you by \, ¥, and ₩ (U+005c, U+00a5, and U+20a9, a.k.a. REVERSE SOLIDUS, YEN SIGN, and WON SIGN)
I agree with Martin.
However, I take a much stronger stance on the issue.
I understand the historical reasons why 0x5C is a yen / won / backslash and why they all map to U+005C. However, U+005C is neither a yen nor a won. It is a backslash (aka reverse solidus). Yen and won have their own respective code points.
A Unicode application that displays a yen or won for U+005C is broken. At least the font that it is using is broken.
The issue will be hard to fix. But a first step would be to fix the fonts. Then consider the layouts for Japanese and Korean keyboards. Should the key be a backslash or a yen / won? Probably a backslash, in my opinion. Yen and won can be entered via an IME.
Sure, fixing the problem will cause a lot of problems. However, the status-quo is also a problem, and has been for a long time.
Were it not for the fact that non-Unicode applications using Shift-JIS (cp-932) would suffer so much, I might agree myself. But I know that a "solution" such as that would really impact many customers in Japan.
In fact, given the decision in the new Japanese font, it may even cause adoption issues for Vista for companies dependent on the behavior....
Ryan, there's no more chance of getting Japan and Korea to change the currency key on their keyboards to a reverse solidus than there is of getting America to accept putting the currency symbol (¤) on Shift-4 and forcing everyone to enter dollar signs via IME. The key just needs to be mapped to the correct Unicode character.
The only real solution to dealing with the bajillions of legacy apps which use <shudder> Shift-JIS with or without extensions is to add a translation layer either in the OS or as a background app.
And the old becomes new again: an OS translation layer like the Atari 800XL used running as a TSR. How long before I can double my USB stick's capacity with a hole punch?
I hope the person who attended Martin's talk understood that the suggestion was made tongue in cheek...
Hard to say, though they seemed quite serious so I am inclined to doubt it...