April, 2008

Sorting it all Out
Michael Kaplan's random stuff of dubious value
Be sure to read the disclaimer here first!
  • Sorting it all Out

    A song that reminds me of a roof (which isn't there, anymore) and a girl (who isn't, either)

    Content of Michael Kaplan's personal blog not approved by Microsoft (see disclaimer)!
    Regular readers should keep in mind that all I said in The End? still applies; the allusion to the X-Files continues for people who understand such references....

    Completely personal historical "life of Michael" stuff that can probably be ignored by people who aren't interested in that sort of thing!

    I found myself on the phone with Andrea again the other day, at an earlier time thankfully enough. :-)

    Although stunning and witty (as those things go), she usually does not take criticism as well as she claims to, which caused me to take the phone call with the same delight with which one welcomes impacted wisdom tooth removal, after I blogged On the one who was Built on the Tale O' the Twister This Way.

    But she accepted it gracefully and then turned the issue around, asking me (if it had to be more about me) what I wanted to talk about.

    Ugh, I hate it when people use my words against me....

    So we found ourselves talking about my move to Hartford, that was almost a couple of decades ago -- from Philadelphia to Hartford. By the time I was done, she suggested that I should write about it.

    You are now reading it. :-)

    I moved because of a girl.

    Well, not exactly.

    I moved because I had broken up with a girl and suddenly the plans I had did not make as much sense as they did just before the break-up. If we were going to end up being just a summer thing (she was in fact the one who unknowingly inspired me to suggest to Kathleen Edwards to keep Summerlong in her setlist back in 2005 as I discussed previously), then I was going to have to take stock and decide what to do with the rest of my life.

    I was probably a bit too young to be trying to think about the rest of my life, an error in judgment that caused more challenges down the road. But let's go back to that time and not look ahead for a moment....

    The original plans (moving to Providence, going to Brown) were out -- Providence is way too small of a town. Too easy to run into the wrong person. It has happened to me in New York City; Providence would never allow me to avoid it.

    I had spent a ton of money on college applications and although most were done to see if I could get in (my expenses were low so the big budget on applications for entertainment value seemed worthwhile at the time), I had lots of choices if I wanted to go somewhere else. I thought about it, very carefully.

    Hartford was not one of those choices; I had not applied to any schools there. I was taking time off and I threw a dart at a map of the US in a bar that I was not supposed to be at (being only almost 18 at the time). It landed in Enfield, CT -- so I flipped a coin to decide whether to make it Springfield or Hartford. And tails won.

    Now the dart and the map were just so I could say one day that's what I did. And I did live in/around the city for like half a decade (Hartford to Vernon to Manchester) and while working through one potential career that aspired to another (discussed previously) I managed to find the job that by the next half decade became my actual career, something I had kind of been doing all along anyway from 7th grade on.

    But anyway, back to Hartford.

    I was kind of broke.

    And when I say broke, I mean that I had enough money to buy a few packs of smokes or some cat food (I went with the cat food, she needed her Friskies more than I needed my Lucky Strikes). All of the money I had went into the 3x the rent (first month + last month + security -- needed for move-in) and although I was working three different jobs, none of them were going to pay me for 1-3 weeks. Toward Thursday night, feeling like I was going to collapse from hunger, I actually ate some of the cat food, which was incredibly bland -- cats eat boring every day and like it. Then by the next day I had a paycheck, and I haven't been quite that broke again since.

    And I was a little bit depressed, what with the whole thing that seemed like a breakup that not too long after turned out to definitely be a breakup. Food didn't help, even when it was less bland.

    My apartment was on the roof of a building on Allen Place (it was a studio that violated fire codes since it had only one exit, and it was easy to break into as the only possession I had of value -- my lost boom box -- would readily attest to were it not stolen the second night after I moved in). So I listened to my depressing mix tape with songs like

    • I Am a Rock (Simon and Garfunkel)
    • In My Room (The Beach Boys)
    • Up on the Roof (The Drifters)

    That tape was made for wallowing.

    The loss of the boom box made the tape less useful, though I still had the car cassette player when heading to and from work, and I knew all the songs so I just ran them through my head, and I guess played some on the mouth harp (they did not steal the harmonicas; I guess they did not find them when they took the radio. Or maybe they realized they could not get much for them).

    So anyway, it was briefly a huge wallow-fest. You probably would have been quite bored if you knew me (and I shunned the friends I had since I did not want to subject them to this).

    During the day I was at a day care center in the morning and an elementary/middle/high school in their after-school program in the afternoon, and I had to be cheerful at both of those jobs. I'm no actor but I think I did well enough for the kids, at least.

    And soon after that I started meeting people in Hartford and West Hartford and taking college classes and I felt kind of back on track. The "break" was over.

    But it is funny, none of the songs from that old tape really remind me of that time on the roof of the building on Allen Place.

    The song that reminds me of that time is Youth Group's Daisychains, a song not released until many years later:

    Listen now my sweet Anne, I never meant to cause you pain.
    We could've spent all summer sitting here making daisychains.
    I lie awake at night staring at my roof.

    Now you're gone...

    For weeks I've had your pretty face hanging in my brain.
    It's suspended like the reflection in a window pane.
    You hang just like a ghost over city streets.

    Now you're gone...

    How could I begin to finish what I couldn't start?
    I'm more General Haig than Napoleon Bonaparte.
    Go now, just leave. No more words please.

    Now you're gone

    Listen now my sweet Anne, I didn't mean to cause you pain.
    I could have spent all summer sitting here making daisy chains.
    I lie awake at night staring at my roof.

    Now you're gone

    (I verified the lyrics on the Youth Group site; like most songs out there, the Internet gets it wrong, mostly!)

    The girl wasn't named Anne but it ends up being close enough for it to work out. The song is not about being on a roof but staring at it, yet that doesn't scare me off either. And although the relationship of the song is not much like the one I was in, for some reason the song just gets me.

    The whole situation was as defining for my life as that girl who inspired Michael Penn to write No Myth (ref: this blog post), the only real difference being that Michael Penn is much more talented/creative than I am, because blog posts are just not as catchy. :-)

    When I listen to Daisychains, in my mind I think I put General Dean (William Dean) rather than General Haig (Douglas Haig). Since I know much less about Haig and I had spent some time studying the Korean War, the Dean notion just fit better in my mind. Both of them had in common the fact that they were each often asked to take action in situations with which they were not entirely comfortable (being asked to do something that conflicted with their advice), in both cases leading to serious consequences (in the case of Haig causing many casualties among his men, in the case of Dean in his own capture and becoming a POW). Dramatic, that. But at that more simple time the relationship ending felt like it had that kind of effect on me, so at least I am being true to that prior version of me. :-)

    I was required to move out of that rooftop apartment due to those fire code regulations after four months, though the landlord kept my rent the same as the studio I moved out of for the two-bedroom I moved into as long as I didn't tell the building inspectors where I used to live. His idea -- I wasn't planning to become a snitch anyway. It was nice to have more space, and fewer break-ins.

    Before I moved out of the studio, Christine came up from Philly to visit and we hung out for a weekend or so. She can probably attest to it being a dump, though at least it was clean. She can definitely attest to the fact that I was dumped.

    Then I moved out of that building within a year or so of moving in there. It was nice enough, but somewhere between the drug dealers and prostitutes and drug users and Trinity students, I just needed to be somewhere that I could get more sleep and a building that had fewer police raids of the lower floors.

    I was just looking at the aerial view of the street in Virtual Earth a few minutes ago, and the street was there, just north of Trinity College. But the building doesn't seem to be there anymore, which seems a little sad. Not that I'd be up on the roof over there or anything (I'd be more likely to go up on the roof of the place on Walnut St. in Philadelphia where I lived before I even moved, and even that is pretty unlikely though that structure still appears to be there at least!).

    Since they apparently tore down the building, it is even more firmly in the past, somehow. But I wonder why that is if I wasn't going to visit anyway? Andrea says it is how we build roots -- we just assume that the places we leave will still be there after all the people go.

    While Andrea and I talked I had Daisychains playing on repeat, and it does remind me of that time on the roof of my apartment on Allen Place, when I put what I thought of at the time as my life back together. By the time I left for Columbus five years later I had found a better reason to move (the previous two were in response to situations involving girls, so I guess I could claim I was becoming more mature).

    Somehow this song has transplanted itself atop a memory that was fully lived out and put into storage before the band had even fully formed, and long before the song existed. I wonder how that happened?

    Andrea didn't have answers on that issue, but we talked for a bit about her place in South Philly and it turns out she has memories that are analogous. I guess we all have these kinds of memories and maybe even songs we hear that remind us of them even if the song is new. She thought the unusual part was that I also connected it all with a person who had never been there, but since it is where I got over her, it does not seem all that unusual to me.

    Well, except for talking about it this much. :-)

    We hung up a bit before midnight, and I started writing.

    About a roof (that isn't there anymore) and a girl (who isn't, either), and a song (that wasn't there at the time but seems to represent it in my mind quite well under the circumstances).

     

    This blog brought to you by ⼧ (U+2f27, aka KANGXI RADICAL ROOF)

  • Sorting it all Out

    On the clipboard, and why sometimes the dumber you are the better you can do

    Over in the Suggestion Box, Bruce Rusk asked:

    To keep a long story long: I have an application that I need to use--it's a Chinese dictionary that I use daily and would be seriously bummed to live without. It's a badly-designed Foxpro application, targeted to Traditional Chinese Windows (Big-5), and I run it in Windows XP with AppLocale.

    It's finicky. It doesn't display quite properly, but well enough to be used. The main issue is with input: there are two ways that I've found to input characters to search:

    (1) via the traditional Chinese IME
    (2) by pasting ... but only from Notepad. Pasting from any other application (Word, a browser, everything else I've tried) results in question marks in place of the characters. Being able to paste is essential to help look things up from texts.

    So my question is: will this rickety train continue to plod along after a Vista upgrade? Is there a way to tell beforehand whether Vista's IME will work and Vista Notepad will have the particular breed of whatever it is (braindeadness?) that makes its clipboard text paste properly? Or is there another way to massage the clipboard into the form this app seems to need?

    Thanks!

    The rickety train should keep muddling along, especially with either the use of AppLocale (though be sure to read the blog I wrote entitled The version of App Locale that runs on Vista? so you know how to get AppLocale on Vista!), or by changing the default system locale to a Traditional Chinese locale -- either way, this non-Unicode application will process the text.

    And now we get to the problem, which is that the dictionary application, which does not know to accept Unicode, is asking for CF_TEXT rather than CF_UNICODETEXT. Many more intelligent applications will register multiple clipboard formats as being available, and when your application asks for CF_TEXT, Word (which assumes it knows the exact code page of the system and which registers the formats it accepts so it can copy later on demand) is able to give that.

    But then silly old Notepad just offers up CF_UNICODETEXT (it is actually even simpler than that but this is close enough for now, maybe I'll get into more later) and lets the system take care of the conversion by creating the text synthetically -- doing its own conversion to CF_TEXT. And since your Dictionary application running under AppLocale assumes the CP_ACP (default system code page) is 950, aka Traditional Chinese, the OS does the conversion and everything is done.

    Word and IE and others are skunked by their own cleverness; in providing "better" functionality and not using the default OS clipboard synthetic behavior, these applications do worse!
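    To make the difference concrete, here is a rough sketch of who ends up doing the conversion. It is a toy model and not the Win32 clipboard API: ToyClipboard, copyUnicodeOnly, copyWithOwnAnsi, and the two tiny "code page" tables are all names and data I made up for illustration, and the assumption that the Word-like source renders its text in code page 1252 is mine as well.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Toy stand-in for a code page: a few Unicode code points and their bytes.
// Real code pages are far larger; these tables exist only for illustration.
using CodePage = std::map<char32_t, std::string>;

const CodePage kCp950  = {{U'\u4E2D', "\xA4\xA4"}};  // U+4E2D in Big-5
const CodePage kCp1252 = {{U'\u00E9', "\xE9"}};      // U+00E9 in Windows-1252

// Hypothetical helper: convert text to a code page, substituting '?' for
// anything the code page cannot represent.
std::string toCodePage(const std::u32string& text, const CodePage& cp) {
    std::string out;
    for (char32_t ch : text) {
        if (ch < 0x80) {                  // ASCII passes straight through
            out += static_cast<char>(ch);
            continue;
        }
        auto it = cp.find(ch);
        out += (it != cp.end()) ? it->second : std::string("?");
    }
    return out;
}

struct ToyClipboard {
    std::u32string unicodeText;           // stands in for CF_UNICODETEXT
    std::optional<std::string> ansiText;  // stands in for CF_TEXT, if the source supplied it

    // A non-Unicode consumer asks for CF_TEXT. Source-supplied bytes win;
    // otherwise the "OS" synthesizes them in the consumer's code page.
    std::string getAnsiText(const CodePage& consumerCodePage) const {
        if (ansiText) return *ansiText;
        return toCodePage(unicodeText, consumerCodePage);
    }
};

// Notepad-like source: offers Unicode only.
ToyClipboard copyUnicodeOnly(const std::u32string& text) {
    return ToyClipboard{text, std::nullopt};
}

// Word-like source: also renders CF_TEXT itself, in the code page it believes in.
ToyClipboard copyWithOwnAnsi(const std::u32string& text, const CodePage& sourceCodePage) {
    return ToyClipboard{text, toCodePage(text, sourceCodePage)};
}

void demo() {
    const std::u32string text = U"\u4E2D";
    // The consumer runs with code page 950 in both cases.
    assert(copyUnicodeOnly(text).getAnsiText(kCp950) == "\xA4\xA4");    // pastes correctly
    assert(copyWithOwnAnsi(text, kCp1252).getAnsiText(kCp950) == "?");  // question mark
}
```

    The only thing the "dumber" source does differently is decline to guess; the conversion then happens with the code page of the application that is actually asking.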

    There is actually no documented way to find out (when IsClipboardFormatAvailable(CF_TEXT) returns TRUE) which type of TRUE it means:

    1. REAL TRUE -- an application registered CF_TEXT and put the text into a buffer;
    2. PROMISED TRUE -- an application registered CF_TEXT and gave a callback which acts as an IOU for the OS to get the text later;
    3. SYNTHETIC TRUE -- an application registered CF_OEMTEXT and/or CF_UNICODETEXT but not CF_TEXT, but the OS knows it can synthesize it later as needed.

    Now the reason that #1 and #2 can be different is left as either an exercise for the student or for some future blog for me to write some day (and yes it is on my list too, with lots of the information I learned from the MSLU days, including how to use CF_LOCALE as well!).

    Notepad (or rather the underlying control within it) only knows to send CF_UNICODETEXT, which can work anywhere via #1 when it is requested or #3 when one of the other formats is.
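    If it helps to see why the caller cannot tell the three kinds of TRUE apart, here is a small sketch. Again this is a toy model under my own assumptions and not the real API: ToyFormatList, TextFormat, setData, and promiseData are hypothetical names, and the only behavior borrowed from Windows is that the three text formats can be synthesized from one another.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

enum class TextFormat { Text, OemText, UnicodeText };  // CF_TEXT, CF_OEMTEXT, CF_UNICODETEXT

struct Entry {
    std::string data;                     // bytes already in a buffer (REAL)
    std::function<std::string()> render;  // IOU to produce them later (PROMISED)
};

class ToyFormatList {
public:
    // REAL: the source hands over the bytes now.
    void setData(TextFormat f, std::string data) {
        entries_[f] = Entry{std::move(data), nullptr};
    }
    // PROMISED: the source hands over a callback instead of bytes.
    void promiseData(TextFormat f, std::function<std::string()> render) {
        entries_[f] = Entry{std::string(), std::move(render)};
    }
    // One answer for all three cases, which is the whole problem.
    bool isFormatAvailable(TextFormat f) const {
        if (entries_.count(f) > 0) return true;  // REAL or PROMISED
        return !entries_.empty();                // SYNTHETIC from another text format
    }

private:
    std::map<TextFormat, Entry> entries_;
};

void demo() {
    ToyFormatList real, promised, synthetic;
    real.setData(TextFormat::Text, "abc");
    promised.promiseData(TextFormat::Text, [] { return std::string("abc"); });
    synthetic.setData(TextFormat::UnicodeText, "abc");  // Notepad-like: Unicode only

    // The consumer sees the same TRUE every time.
    assert(real.isFormatAvailable(TextFormat::Text));
    assert(promised.isFormatAvailable(TextFormat::Text));
    assert(synthetic.isFormatAvailable(TextFormat::Text));
}
```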

    But in the end all hope is not lost and you have two easy workarounds that require no changes to applications and no external code to be written:

    • You can change the default system locale to one of those code page 950-using locales, or
    • You can run both your Dictionary application and the other applications under AppLocale using the same locale.

    The former will simply work everywhere, and the latter will work between your Dictionary application and any application you set up to work with it....

     

    This blog brought to you by អ (U+17a2, aka KHMER LETTER QA)

  • Sorting it all Out

    The importance of Tagalog to Burmese, aka "Of course I'd lie to you, I'm a font!"

    This is not a post discussing some kind of geopolitical issue involving Myanmar (Burma) and the Philippines.

    You see the other day, regular reader Andrew West, in a comment to my Who forgot the culture?, asked:

    Completely off-topic, but I notice that you embed the sponsoring character (U+1831, MONGOLIAN LETTER SHA) in an html font tag specifying "Mongolian Baiti" as the font face. It drives me crazy that IE7 (like IE6 before it) lists Mongolian as a "language script" that you can configure the font for, but it will not populate the font lists with any fonts regardless of how many fonts you have on the system that support Mongolian (including Mongolian Baiti), so it is impossible to actually configure what Mongolian font to use! The good news is that it does display Mongolian using Mongolian Baiti without explicitly specifying the font in the html, but the bad news is that I can't get it to use a Mongolian font other Mongolian Baiti without messing with the html. I just wish someone would fix IE ... or is this one of those Kafkaesque examples like Uyghur where Microsoft can't fix something, however broken, in case it breaks user expectations?

    I suspected that I knew what was going on here, but it was really worthy of its own blog and I wasn't sure how quickly I'd get to it so I recommended he put a note in the Suggestion Box just in case it wasn't going to be quick....

    Which he did:

    I know I'm not going to like the answer, but can you explain how the font configuration dialog in Internet Explorer works, in particular the behaviour for Mongolian (font list is never populated) and Myanmar (font list is populated with fonts that cover Tagalog, but none that cover Myanmar)?

    Then I had to cancel the blog that was happening for this slot and I ended up deciding to do it right away instead....

    Andrew is right that he probably won't like the answer, but it is something that is fixable, and even technically work-aroundable if a font author is willing to do something that he or she would ordinarily consider to be very stupid. :-)

    Perhaps I should explain.

    We'll start in the Tools|Internet Options... Fonts... dialog:

    (I guess I have no Tagalog fonts!)

    and for good measure we'll include one that has some fonts in it:

    Now in the end the information on actual selections is stored in the registry, under

        HKCU\Software\Microsoft\Internet Explorer\International\Scripts

    which is clearly an Internet Explorer settings key with SCRIPT ID values 36 (Myanmar) and 39 (Mongolian):

    But for the list of potential fonts, that is not IE at all; that is MLang.

    Now I blogged this a bit over two years ago in Where are the IE plain text fonts?, and in that blog I mentioned:

    Now the actual population of the two lists is happening via MLang, and as Paul points out you could think of the list on the left as being for proportional fonts and the list on the right as being for fixed pitch (monowidth) fonts.

    MLang goes through a two step process that I will get into in another post, coming soon. :-)

    And since I never did get back to it, I guess Andrew has proof that things often get lost if they aren't put in one of those lists like the Suggestion Box! I am actually happy to have the proof because otherwise I look kind of petty or something with my request....

    Anyway, I'll explain it now -- it all works via a Trust; But Verify! mechanism.

    The Trust part is where it trusts the font to describe its own Unicode ranges in its own internal FONTSIGNATURE.fsUsb bits, the Unicode Subset bits. That is step one.

    The Verify part is where it does a spot check on a specific Unicode code point in the script range, to make sure that the FONTSIGNATURE is not lying. Because FONTSIGNATUREs, like men, lie. Like that bit from the movie Up the Creek and its fictional typographical version Up the Foundry between Tim Matheson (as the font) and Jennifer Runyon (as the user):

    Font: I will tell you about my coverage.
    User: You wouldn't lie to me?
    Font: Of course I'd lie to you, I'm a font. But I'm not lying now....

    In fact, it really relies on that Verify step and perhaps even skips the Trust step a bit, sometimes?

    And it spot checks the font CMAP to make sure a specific candidate character is in it.

    I mentioned there was a problem here, didn't I?

    Here is where the problem sits.

    Deep in the heart of MLang, in its mlflink.cpp source file, it has:

    • Myanmar with a candidate character of U+1700 (ᜀ, aka TAGALOG LETTER A)
    • Mongolian with no candidate character given at all.

    And this is why Mongolian never shows up (since it has no explicit character to check for) and Myanmar shows up when your font has Tagalog (since that is the character it looks for).

    Which is the essential workaround for Myanmar -- add that one specific Tagalog character to your Burmese font? Totally obnoxious, but until/unless someone fixes MLang....
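    The mechanism and its two failures can be sketched in a few lines. This is my own toy model and not MLang's actual code: ToyFont, kCandidates, and isListedForScript are hypothetical names, the script names stand in for the FONTSIGNATURE.fsUsb bits, and the sketch always performs the Trust step even though the real thing may skip it at times.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy font: what it claims (Trust) and what it really maps (Verify).
struct ToyFont {
    std::set<std::string> claimedScripts;  // stands in for FONTSIGNATURE.fsUsb bits
    std::set<char32_t> cmap;               // code points the font can display
};

// Candidate characters as described in the post; 0 means none was given.
const std::map<std::string, char32_t> kCandidates = {
    {"Khmer",     0x1780},  // KHMER LETTER KA
    {"Myanmar",   0x1700},  // TAGALOG LETTER A (the wrong script)
    {"Mongolian", 0},       // nothing to spot check
};

bool isListedForScript(const ToyFont& font, const std::string& script) {
    if (font.claimedScripts.count(script) == 0) return false;      // Trust
    auto it = kCandidates.find(script);
    if (it == kCandidates.end() || it->second == 0) return false;  // no candidate to check
    return font.cmap.count(it->second) > 0;                        // Verify
}

void demo() {
    ToyFont burmese{{"Myanmar"}, {0x1000}};      // MYANMAR LETTER KA only
    ToyFont mongolian{{"Mongolian"}, {0x1820}};  // MONGOLIAN LETTER A only

    assert(!isListedForScript(burmese, "Myanmar"));      // honest font, still rejected
    assert(!isListedForScript(mongolian, "Mongolian"));  // can never pass

    burmese.cmap.insert(0x1700);                 // the obnoxious workaround
    assert(isListedForScript(burmese, "Myanmar"));
}
```

    No amount of extra glyphs helps the Mongolian font in this model, since there is no candidate character for it to contain.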

    Let's put all the values in a table so you can see them:

    Script              Script Id  Code point  Character  Character Name
    Greek               5          U+03ac      ά          GREEK SMALL LETTER ALPHA WITH TONOS
    Cyrillic            6          U+0401      Ё          CYRILLIC CAPITAL LETTER IO
    Armenian            7          U+0531      Ա          ARMENIAN CAPITAL LETTER AYB
    Hebrew              8          U+05d4      ה          HEBREW LETTER HE
    Arabic              9          U+0627      ا          ARABIC LETTER ALEF
    Devanagari          10         U+0905      अ          DEVANAGARI LETTER A
    Bengali             11         U+0985      অ          BENGALI LETTER A
    Gurmukhi            12         U+0a05      ਅ          GURMUKHI LETTER A
    Gujarati            13         U+0a85      અ          GUJARATI LETTER A
    Oriya               14         U+0b05      ଅ          ORIYA LETTER A
    Tamil               15         U+0b85      அ          TAMIL LETTER A
    Telugu              16         U+0c05      అ          TELUGU LETTER A
    Kannada             17         U+0c85      ಅ          KANNADA LETTER A
    Malayalam           18         U+0d05      അ          MALAYALAM LETTER A
    Thai                19         U+0e01      ก          THAI CHARACTER KO KAI
    Lao                 20         U+0e81      ກ          LAO LETTER KO
    Tibetan             21         U+0f40      ཀ          TIBETAN LETTER KA
    Georgian            22         U+10d0      ა          GEORGIAN LETTER AN
    Ethiopic            27         U+1300      ጀ          ETHIOPIC SYLLABLE JA
    Canadian Syllabics  28         U+1401      ᐁ          CANADIAN SYLLABICS E
    Cherokee            29         U+13a0      Ꭰ          CHEROKEE LETTER A
    Yi                  30         U+a000      ꀀ          YI SYLLABLE IT
    Braille             31         U+2800                 BRAILLE PATTERN BLANK
    Runic               32         U+16a0      ᚠ          RUNIC LETTER FEHU FEOH FE F
    Ogham               33         U+1680                 OGHAM SPACE MARK
    Sinhala             34         U+0d85      අ          SINHALA LETTER AYANNA
    Syriac              35         U+0710      ܐ          SYRIAC LETTER ALAPH
    Myanmar             36         U+1700      ᜀ          TAGALOG LETTER A
    Khmer               37         U+1780      ក          KHMER LETTER KA
    Thaana              38         U+0780      ހ          THAANA LETTER HAA
    Mongolian           39         -0-

    You can probably see the other problem here -- all of the scripts that are missing; perhaps the fix needs to be a bit more than just the two broken ones, in the long run....

    Speaking of which -- any NLS testers stirring about who'd like to enter a bug on this small bundle of MLang issues that will also affect IE8 on the next version of Windows if it isn't fixed? :-)

     

    This blog brought to you by ᜀ (U+1700, aka TAGALOG LETTER A)

  • Sorting it all Out

    Fight the Future? (#1 of ??), aka The inappropriate nature of getting the Feh out of Uighur

    So it was just within the last couple of weeks that comments ended up on the Vista Team Blog, like one from Shanghai software engineer Abdusalam, aka VistaUyghur, in this post:

    Greetings,

    Well, this seems to be the place to send feedbacks on Vista SP1 RTM for now, right?

    Then let's start.

    I am an unofficial (non-Microsoft Connect user) tester for Windows Vista SP1 RTM now. After the first relaese of Vista RTM, we did a test on it.  As a result, we found a very serious problem, that was/is the IME issue related to the KEYBOARD LAYOUT of Uyghur (aka Uighur in Vista) language.  Afterwards, we sent a feedback to Microsoft China, and they DID AGREE that they will fix this through the coming service pack, SP1.  However, the problem still exists in the SP1 RTM for Vista.  I'm not sure if this issue will be fixed through later hot fixes.  I think this blog site is apparently not the place to provide too much details.  Here I would like to know how I can send this important (for us, maybe also for you) feedback info to you or Microsoft, DIRECTLY.

    Thanks.

    Or the one done several days prior and then repeated one day prior by someone with the handle Uyghur in this post, which has the advantage of spelling out what the reported problem is:

    Dear Mr. Nick White,

    We are Uyghur (Uighur) and Microsoft Windows users. We were very excited when Microsoft started to support our language and script in its Windows Vista operating system. When the Beta version was released, we tested it extensively, found many bugs, and reported them to Microsoft. Later we were told that these bugs would be fixed in its official release. We found the same bugs again in its official release and reported them again, and were informed that they would be fixed in SP1. Now the bugs are still there in SP1 and we are very frustrated.

    This serious bug is about the Vista's support of our language and script - Uyghur (Uighur). (see
    http://en.wikipedia.org/wiki/Uyghur_language for more details about the language and script.)

    Our script - Uyghur (Uighur) is an alphabetic script with 32 letters, based on Arabic and written from right to left. In Vista, Microsoft's support of our script comes with a font named MS Uighur, an input method and a keyboard layout.

    In Unicode standard, national and local standard, one of our letter, F's unicode number is 0641, but Microsoft have used 06A7 instead, resulting in serious incompatibility issues.

    Prior to Vista and until now, we have been processing our script on Windows 98, 2000, and XP with third party fonts, input methods, and keyboard layouts, using unicode character 0641 for our letter F.

    With this serious incompatibility problem, we have been in great difficulty in migrating from previous versions to Windows Vista. We hope Microsoft and the Vista Development Team take this issue seriously and help us using Microsoft products easily and comfortably by fixing these bugs in time.

    Sincerely,

    On behalf of Uyghur (Uighur) people

    A Uyghur (Uighur)

    First let's look at these two letters:

    U+0641    ف    ARABIC LETTER FEH

    U+06a7    ڧ    ARABIC LETTER QAF WITH DOT ABOVE

    Okay, we know there are some similarities but they are two different letters, clearly.

    Let's take a quick gander at that keyboard, particularly the VK_F key, in both the base state and the shifted state:

    Yep, there it is.

    Now you might be able to see given the similarities how someone might have made a mistake (whether in the subsidiary, in Xinjiang, or in China -- I am not sure where the .KLC file was produced myself, all I can see at the moment is that the file was given to me and I checked it into the Vista project on April 19, 2005 a bit after midnight Redmond time).

    And that the keyboard I checked in then had the right character in it.

    THEN, on July 21, 2006 at about 2:30 AM, in direct response to a bug report, the change was made to put what is now being called the wrong letter....

    Note that this was very late in the cycle for Vista and required some extra information on the justification.

    The comments provided by the people who looked into the bug report at the time explained:

    The bug is a small problem of the letter assignments in the keyboard, which leads to big usability problems for users trying type of Uighur.

    We need to change a couple of code point mappings on the Uighur keyboard layout, since currently some keystrokes produce unexpected results for the user. The keyboard doesn't work.

    Doubts were raised at the time but those doubts were overridden by the strong feedback about the usability issues if the bug was not fixed....

    Anyway, to answer the questions raised by the people who reported these problems late last month and despaired that they were not addressed in SP1, it seems like there are different forces at work here.

    To make it more fun, let's look at these two characters in Tahoma and Microsoft Sans Serif and Microsoft Uighur, blown up to 48pt:

    I am starting to understand why a former colleague of mine used to refer to Tahoma as a "crap cartoon font" (mentioned before here and here) for the Arabic script, and I'll go out on a limb and suspect that this kind of thing might have played a part in the [possibly incorrect] feedback?

    In any case, the FEH is much more likely to be the right letter here, all things being equal. And the earlier last minute bug report was probably in error.

    I am not on the NLS team any more and would not pretend to speak for them here, but if someone in Microsoft China made such a claim about timing as Abdusalam mentioned then that person spoke out of turn as this is a complicated issue to manage and solve and service and maintain -- since of course Microsoft can't change a keyboard layout even if it is provably wrong (ref: here and here), even if in the future the right layout were produced and added to Windows (and such things in service packs are pretty much unheard of).

    In the meantime, MSKLC is a great workaround to get the keyboard layout one wants, and perhaps if U+06a7 is not used in Uighur (as it appears not to be) then some future version of Windows could fold these two characters together for collation purposes like we did for Romanian and its comma below/cedilla characters! :-)
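    For anyone wondering what folding two characters together would even mean, here is a minimal sketch of the idea. It is not how Windows collation is implemented (real folding would live in the sort weight tables), and foldUighurFeh, foldedKey, and compareFolded are hypothetical names of my own; the sketch simply assumes that U+06a7 has no other use in Uighur text and compares by plain code point order after folding.

```cpp
#include <cassert>
#include <string>

// Assumption from the post: U+06A7 is not otherwise used in Uighur, so it can
// be treated as U+0641 when comparing.
char32_t foldUighurFeh(char32_t ch) {
    return ch == U'\u06A7' ? U'\u0641' : ch;
}

// Build a comparison key with the two letters folded together.
std::u32string foldedKey(const std::u32string& text) {
    std::u32string key;
    key.reserve(text.size());
    for (char32_t ch : text) key += foldUighurFeh(ch);
    return key;
}

// Three-way comparison on folded keys (plain code point order, not real
// linguistic collation).
int compareFolded(const std::u32string& a, const std::u32string& b) {
    return foldedKey(a).compare(foldedKey(b));
}

void demo() {
    const std::u32string typedOnVista   = U"\u06A7\u0627";  // QAF WITH DOT ABOVE + ALEF
    const std::u32string typedElsewhere = U"\u0641\u0627";  // FEH + ALEF

    assert(typedOnVista != typedElsewhere);                    // different code points
    assert(compareFolded(typedOnVista, typedElsewhere) == 0);  // same after folding
}
```

    Text typed with either keyboard would then sort and match as the same string, though the underlying code points would still differ for anything that does a binary comparison.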

     

    This blog brought to you by ف and ڧ (U+0641 and U+06a7, aka ARABIC LETTER FEH and ARABIC LETTER QAF WITH DOT ABOVE)

  • Sorting it all Out

    Have you ever been hooked by the tale o' Lex twister?

    Note that this post is entirely offtopic and if that kind of thing bothers you then you are invited to get out right now....

    So it started with the song Tail O' the Twister by Chagall Guevara (well, by Steve Taylor) that made it onto the CD but not the cassette of the Pump Up the Volume soundtrack.

    Especially the way I connected the look on Liz Phair's face on the cover of her eponymous album that I had the autograph on:

    Now the look on her face is not a come hither one at all. It is (as Steve Taylor described in the song):

    A barstool yawn to a stuttered come on, it's a dirt road rut, she said 'button up mister'

    You know, kind of an "I'm looking, but you're not all that" sort of look. A look which, as I mentioned, I have seen before, as a few women have had it in the past looking at me. :-)

    Like just last night at Allison's Bat Mitzvah party, when I explained the look to my cousin Alexis (Coop to many who do not get distracted by The O.C. reference, and Lexie to family and no one else), and she spontaneously came up with the following (by spontaneous I mean no direction as to where to put her head or anything):

    Lexie lives in New York now and is keeping up quite nicely (something I was only okay at doing when I was there) and I imagine many young guys who do not pass muster get that same look.

    At some point her shoe was thrown into a pile where guys at the party got to dance with whoever's shoe they got, and she made some 13 year old's night for being able to have a dance with an older woman.

    I don't have that picture (someone else got it) but I'll probably be uploading a lot of the ones I do have eventually and then tagging various cousins in Facebook (whether they are expecting it or not!). I just felt like this Tale O' Lex Twister moment had to be captured separately in this quite off-topic blog. :-)

     

    The characters in Unicode should be able to resume their blog sponsorship requirements soon...

  • Sorting it all Out

    Fight the Future? (#8 of ??), aka The Bug(s) Spotted, aka Design flaws are worse than bugs

    • 7 Comments


    It seems like it was just days ago when I blogged Fight the Future? (#2 of ??), aka Spot the Bug(s)!, which provided the following code and asked (dared?) people to find the problem(s) therein:

    if( _plv->_ci.dwExStyle & WS_EX_RTLREADING)
    {
        if (item.pszText)
        {
            //
            // temp hack for the find.files to see if LtoR/RtoL mixing
            // works. if ok, we'll take this out and make that lv ownerdraw
            //
            if ((item.pszText[0] != '\xfd') && (item.pszText[lstrlen(item.pszText)-1] != '\xfd'))
            {
                textflags |= SHDT_RTLREADING;
            }
        }
    }

    Now many people pointed out that this code can't handle the case when the text in item.pszText is a zero-length string (leading to at best a not-so-nice one character buffer underrun), and several also went further in guessing that item was meant to be an LVITEM Structure in a list-view control. One person even pointed out what Ben referred to as the principal error being targeted when he asked the question initially:

    Short answer:
    The comparison is always false because WCHAR is zero-extended but '\xfd' is sign-extended.

    Long answer:
    Among the solvers, there was confusion about which half of the comparison was signed.  In our project, WCHAR is unsigned, and char is signed.  So '\xfd' is sign extended and pszText[0] will not be, so they always compare as unequal.

    This is most evident from the assembly code.  Did you know one dev even closed this bug since they couldn't believe the compiler would do such a thing?


    75ae1e1f 0fb708   movzx   ecx,word ptr [eax]   // zero extend a WORD (pszText[0])
    75ae1e22 83f9fd   cmp     ecx,0FFFFFFFDh       // compare to a DWORD literal

    One person noted that there should be a compiler flag to keep this sign mismatch issue from happening. He is right, it is /J, described in the topic /J (Default char Type is unsigned).
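
    To see why the comparison can never succeed, here is a small Python sketch that models the two promotions. The helper names are made up for illustration, and the code only imitates what the compiler does with a 16-bit unsigned WCHAR and a signed 8-bit char; it is not the Windows code itself.

```python
def zero_extend_wchar(code_unit: int) -> int:
    """Model a 16-bit unsigned WCHAR widened to a 32-bit int (movzx)."""
    return code_unit & 0xFFFF


def sign_extend_char(byte_value: int) -> int:
    """Model a signed 8-bit char literal such as '\\xfd' widened to int."""
    byte_value &= 0xFF
    return byte_value - 0x100 if byte_value & 0x80 else byte_value


def chars_match(code_unit: int, char_byte: int, char_is_signed: bool = True) -> bool:
    """Compare the way the C expression pszText[0] == '\\xfd' would."""
    left = zero_extend_wchar(code_unit)
    right = sign_extend_char(char_byte) if char_is_signed else char_byte & 0xFF
    return left == right


if __name__ == "__main__":
    # With a signed char, 0xFD widens to -3 (bit pattern 0xFFFFFFFD).
    print(hex(sign_extend_char(0xFD) & 0xFFFFFFFF))
    # No 16-bit value equals -3, so the test fails for every possible WCHAR.
    print(any(chars_match(u, 0xFD) for u in range(0x10000)))
    # With an unsigned char (the /J behavior), U+00FD compares equal.
    print(chars_match(0x00FD, 0xFD, char_is_signed=False))
```

    Note that flipping the signedness only makes the test match U+00FD, which is not the character the original code was looking for.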

    But although this bug does keep the code from ever working, it still is just a bug -- and a quick fix once it is identified.

    And then several people noted the more serious issue -- the fact that the code is using a non-Unicode character ('\xfd') in the comparison against Unicode characters in the item.pszText -- in an attempt to look for U+200e, the LEFT-TO-RIGHT MARK (which is only '\xfd' in Windows code pages 1255 and 1256, and not in Unicode ever).
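
    The code page dependence is easy to check. The sketch below uses Python's built-in codecs, which I am assuming are a fair stand-in for the Windows code page tables; decode_fd is just an illustrative helper name.

```python
LTR_MARK_BYTE = b"\xfd"  # the byte the old hack was testing for


def decode_fd(code_page: str) -> str:
    """Return the Unicode character that byte 0xFD maps to in a code page."""
    return LTR_MARK_BYTE.decode(code_page)


if __name__ == "__main__":
    # Hebrew and Arabic code pages first, then two Latin ones for contrast.
    for code_page in ("cp1255", "cp1256", "cp1250", "cp1252"):
        ch = decode_fd(code_page)
        print(f"{code_page}: U+{ord(ch):04X}")
```

    Only the Hebrew and Arabic code pages give U+200E; the Central European and Western ones give U+00FD, an ordinary letter.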

    Now we get to where the wheels fall off the wagon a bit. :-)

    There was some speculation about what the actual fix for this problem would be, and the "internal" answer for all of this was pretty direct:

    The deeper bug requires some context.  One person was kind enough to provide some detailed history:

    "It's worse than you think.

    The code was originally written for the Mideast version of Windows 95. That version of Windows uses ANSI not Unicode, and the code pages are 1255 (Hebrew) and 1256 (Arabic). In both of those code pages, character 0xFD is Unicode character U+200E (LEFT-TO-RIGHT MARK). The code was protected under #ifdef WINDOWS_ME so it would be active only on Arabic and Hebrew systems.

    This code was ported to Unicode without paying attention to the code page assumption hiding behind the #ifdef. Lucky for us, the code was ported incorrectly and the test never succeeds.  A naive "fix" would corrupt Czech strings: The comparison would think that character U+00FD (LATIN SMALL Y WITH ACUTE) is the LTR marker and any string that begins and ends with that character gets treated as Arabic/Hebrew text.

    The correct fix is to delete the test entirely. We are all-Unicode now. We don't need an old hack for Hebrew/Arabic Windows 95."

    But in some ways I find this answer a little bit wrong and also way less than complete, to be honest.

    The hint for my issue here can be found if you look at the comment:

            //
            // temp hack for the find.files to see if LtoR/RtoL mixing
            // works. if ok, we'll take this out and make that lv ownerdraw
            //

    What kind of temp hack designed to test a specific feature lives on long enough to make it to a Unicode conversion, intact?

    Now the problem with LtoR/RtoL mixing does not go away when you convert to Unicode -- it just gets harder. And the initial hack was indeed a hack because it was never really a great solution to begin with.

    You can see the underlying real problem in action with the user interface language list -- shown here in Vista on that machine with all of the home-built locales, with an English UI language:

    though not in the smaller "official" list with an English user interface language since there are no RTL languages with parentheses listed:

    though the bug comes back to haunt us with a right-to-left user interface language with many examples:

    Now this is yet another case of the problem I talked about in Mixing it up with bidirectional text, where any time you have "islands of text" within other text where the island:

    1. does not have the same directionality as the overall user interface, and
    2. either the first character or the last one has a neutral Bidi class

    then one needs to put in a non-neutral character such as U+200e (LEFT-TO-RIGHT MARK) or U+200f (RIGHT-TO-LEFT MARK) -- depending on the desired directionality of the island.

    The "old" temp hack fix -- presumably only running on Hebrew or Arabic Win9x -- was

    • a bit incomplete (since it never added characters even if it needed to) and
    • a bit heavy handed (since many strings would need the flag set, basically any with strong LTR characters at the beginning or the end) and
    • a bit short-sighted (since it only looked at the problem of LTR text in an RTL world, not the converse scenario).

    So the person who suggested the code could just be removed was right -- if you are willing to live with strings that have serious potential to look wrong in any of those directionality-spanning scenarios.

    What should be there? Well, an algorithm that:

    1. Looked at the directionality of the first and last characters/pieces in the string, and
    2. Looked at the directionality of either the surrounding text or the user interface (whichever was appropriate), and
    3. Whenever a difference between either/both sides of #1 and what was found in #2 was seen, added the appropriate RLM or LRM marker to cause the text to look right, kind of like I suggested in Mixing it up with bidirectional text.

    Now there are two ways this bug can manifest -- if the character in question is neutral and mirrored, it can show up on the wrong side of the string, reversed. And if it is just neutral but is not mirrored, then it will look right but still be on the wrong side of the string.

    Both problems are fixed by the above description of the algorithm (the code for which is left as an exercise for the reader).
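
    For readers who want a head start on that exercise, here is one hedged attempt in Python. It is a sketch of the three steps above, not anything that ships in Windows: the function names are invented, the island's direction is guessed from its first strong character (my assumption), and a mark is only added on an edge that is not already strong in the island's direction.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK
RLM = "\u200f"  # RIGHT-TO-LEFT MARK


def strong_direction(ch: str):
    """Return "ltr" or "rtl" for strong characters, None for neutral/weak ones."""
    bidi_class = unicodedata.bidirectional(ch)
    if bidi_class == "L":
        return "ltr"
    if bidi_class in ("R", "AL"):
        return "rtl"
    return None


def island_direction(text: str):
    """Guess the island's direction from its first strong character (assumption)."""
    for ch in text:
        direction = strong_direction(ch)
        if direction is not None:
            return direction
    return None


def add_direction_marks(text: str, context: str) -> str:
    """Add LRM/RLM at the edges of an island shown in a "ltr" or "rtl" context."""
    island = island_direction(text)
    # Step 2: nothing to do if the island agrees with its surroundings.
    if island is None or island == context:
        return text
    mark = LRM if island == "ltr" else RLM
    # Steps 1 and 3: mark each edge that is not strong in the island's direction.
    prefix = "" if strong_direction(text[0]) == island else mark
    suffix = "" if strong_direction(text[-1]) == island else mark
    return prefix + text + suffix


if __name__ == "__main__":
    # The trailing parenthesis is neutral, so it gets an LRM in an RTL context.
    print(repr(add_direction_marks("NAME (BIG)", "rtl")))
```

    A real implementation would also have to decide what to do with strings that contain no strong characters at all, which this sketch simply leaves untouched.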

    In fairness to that initial code, it was a temp hack presumably to test whether using strong LTR characters would help with directionality of listview items on RTL platforms, but obviously if the code is never removed and then the actual issue is never fixed then there is clearly a bug here -- a bug caused by the lower quality bar implicit in a temp hack which, by never being revisited, proved the underlying problem in ever lowering one's quality bars in code one checks in.

    Which in my mind is the most serious bug here -- the conceptual design flaw caused by never finishing the work to solve a genuine issue.

    It would be great if this code were written up in a function, which could then be used in all of the places in the UI where such strings can or do show up, from Listviews to Listboxes and beyond....

     

    This blog brought to you by ‏)‏ (U+0029, aka RIGHT PARENTHESIS, mirrored of course due to a surrounding U+200f entourage...)

  • Sorting it all Out

    The mythical nature of bidirectional support, and where the wheels come off the wagon

    • 9 Comments


    The problem has its roots in Mixing it up with bidirectional text and The Bug(s) Spotted, aka Design flaws are worse than bugs, two blog entries which talk about specific lamenesses with the bidirectional support within Windows.

    I don't want to imply that there aren't more problems beyond these. Because to be perfectly honest, there are.

    Microsoft is incredibly lame here, though to be frank for a moment only lame in a way that everyone else is too, right now. Including Unicode.

    To illustrate, I'll need a sample bit of text.

    Let's build up a path. :-)

    We'll take a nice little English string:

    NAME ‎(BIG)‎

    And then we'll make another one in Hebrew, kind of a localized version of that string.

    שם ‏(גדול)‏

    It is really quite reasonable to hope one could take these chunks, create a path with them (one chunk per directory) and have everything come out right.

    I mean a path like:

    C:\NAME ‎(BIG)‎\שם ‏(גדול)‏\NAME ‎(BIG)‎\שם ‏(גדול)‏

    may be a Destryian scenario, but at its root it's just a small valid scenario that you would really want to work.

    Let's try it with no special decorative control characters and leave it to the whim of your browser:

    C:\NAME (BIG)\שם (גדול)\NAME (BIG)\שם (גדול)

    It didn't look right on any of the four that I tried (Safari, Firefox, Opera, and Internet Explorer).

    How about in Notepad?

    Well you can choose your means of failure there via the right-click menu:

    vs.

    Let's try it on the latest and greatest version of Windows, as a path:

    Hmmmm. Not so great in the breadcrumb bar, huh? What if we click in the address bar space to get rid of the breadcrumb bar:

    Still broken, those tokens. All of the English ones look fine, but the Hebrew ones are broken.

    Maybe we can do better on a Hebrew user interface language.

    We'll look at the breadcrumb bar again:

    Well, good news and bad news here -- the Hebrew looks good now, but the English is broken!

    Is the hope for

    C:\NAME ‎(BIG)‎\שם ‏(גדול)‏\NAME ‎(BIG)‎\שם ‏(גדול)‏

    such a fruitless one? So very unreasonable?

    Turns out that if you are running on Windows, it is. :-(

    Now obviously you can do some work here with U+200e (LEFT-TO-RIGHT MARK) and U+200f (RIGHT-TO-LEFT MARK) or other Bidi control characters to try and make this better, but obviously this is something one wants to have happening behind the scenes without requiring the user to add control characters to the string.

    Especially a string where the intent is so obvious and easy to discern.... a slightly more complicated case than the one in Mixing it up with bidirectional text but not all that much more complicated, is it?

    But it is by no means an easy problem for users to have to solve, so it really would be much better if the OS could do the heavy lifting here, rather than forcing it on everyone else.

    Which is not to say there is some other operating system that magically does everything right here. Last time I checked, no one was doing so well in this space, and bidirectional support in these edge cases is kind of a myth for now....

    Let's pause to do a little RCA (Root Cause Analysis) for the problems here. As a standard, the Bidirectional Algorithm sits several levels lower than what one needs to handle the mix of LTR and RTL scripts, and the various "clients" (be they application or operating system or browser or other) more or less support the standard but do not provide a whole lot beyond it (other than sometimes providing that notion of a higher level definition of default directionality). It does quite well with cases like Hebrew that actually have some LTR pieces within themselves, but there is no good way to handle other script LTR text embedded within unless a bunch of other work happens. Work that no one really wants to provide. Remember what that one person said in response to that hack bug:

    "The correct fix is to delete the test entirely. We are all-Unicode now. We don't need an old hack for Hebrew/Arabic Windows 95."

    No one wants to do too much beyond Unicode even though plain Unicode alone (without making use of higher level protocols to place control characters) is insufficient for handling these cases....

    Note that this is also one of the reasons RTL IDN is so complicated and looks so broken most of the time.

    It all amounts to A place where everyone blows, equally.

     

    This blog brought to you by U+200e and U+200f (aka LEFT-TO-RIGHT MARK and RIGHT-TO-LEFT MARK)

  • Sorting it all Out

    Even if the text is right underneath, it may look wrong close up....

    • 1 Comments


    Regular reader Jan Kučera asked over in the Suggestion Box:

    Hi again,

    I know the behaviour I mention here is not problem you can solve, but I'm interested in handling RTL fragments in "plain text". What I've encounted is like this.

    I have both IMAP and web access to my e-mail. I don't have a SMTP server, so I send the mails from web and read them in Outlook 2007. One day, I wanted to know the author of Hebrew lyrics to the Maya the Bee song, so I wrote an e-mail to an Izrael TV which had a page about the series. The title of the e-mail was "Maya the Bee (הדבורה מאיה)" and I repeated these words in the message body. Need to say, I have the web mail configured to write plain text e-mails.

    The surprise came with the answer. On the web, everything was okay, as I had written it. But in the Outlook, although the title remained ok (the hebrew phrase being selected from right to left), in the message body, I saw הדבורה first, followed by מאיה, letting the user select the row with hebrew text without troubles, char by char, from left to right.

    When I copied it and pasted to the notepad, everything was ordered and behaving okay again.

    The mail was encoded in 1255 and the sender used Thunderbird 2, but I don't think this is too important since in IE and other applications the text is formatted as it should.

    What is more important is the title, encoded as "Subject: Re: Maya the Bee ( =?windows-1255?Q?=E4=E3=E1=E5=F8=E4_=EE=E0=E9?= =?windows-1255?Q?=E4=29?=" which could prevent Outlook from interpreting badly the title too.

    E-mail reply was in HTML.

    Now, the question is, beside whether this is a bug at all, how could be RTL phrase rendered in LTR, and what could we, as developers, do to avoid this issue in our programs.

    PS: The answer to my question is Dan Zakai (דן זכאי). Or... דן and זכאי, as shown by Outlook? :)

    It is actually not that hard to discern the relationship between

    הדבורה מאיה

    and the weird part of the string in

    "Subject: Re: Maya the Bee ( =?windows-1255?Q?=E4=E3=E1=E5=F8=E4_=EE=E0=E9?= =?windows-1255?Q?=E4=29?="

    Just look at that Windows code page 1255 chart:

    • E4    - U+05d4, aka ה, HEBREW LETTER HE
    • E3    - U+05d3, aka ד, HEBREW LETTER DALET
    • E1    - U+05d1, aka ב, HEBREW LETTER BET
    • E5    - U+05d5, aka ו, HEBREW LETTER VAV
    • F8    - U+05e8, aka ר, HEBREW LETTER RESH
    • E4    - U+05d4, aka ה, HEBREW LETTER HE
    • EE    - U+05de, aka מ, HEBREW LETTER MEM
    • E0    - U+05d0, aka א, HEBREW LETTER ALEF
    • E9    - U+05d9, aka י, HEBREW LETTER YOD
    • E4    - U+05d4, aka ה, HEBREW LETTER HE
    • 29    - U+0029, aka ), RIGHT PARENTHESIS

    So it is some kind of encoding of text into cp1255 with the text in appropriate logical order that anyone who understands the format should be able to use to decipher the text.
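
    The format in question is the "Q" encoded-word form from RFC 2047, and it can be unpicked by hand. The sketch below is a deliberately tiny decoder written for this one subject line (the helper names are mine, and it ignores the "B" form and other details a real mail library handles); it relies on Python accepting windows-1255 as a codec name.

```python
import re

# Matches one Q-encoded word: =?charset?Q?payload?=
ENCODED_WORD = re.compile(r"=\?([^?]+)\?[Qq]\?([^?]*)\?=")


def decode_q(charset: str, payload: str) -> str:
    """Decode the payload of a single Q encoded-word into text."""
    payload = payload.replace("_", " ")  # "_" stands for a space
    raw = re.sub(
        r"=([0-9A-Fa-f]{2})",
        lambda m: chr(int(m.group(1), 16)),
        payload,
    ).encode("latin-1")  # back to the original bytes
    return raw.decode(charset)


def decode_subject(value: str) -> str:
    """Decode every Q encoded-word in a header value."""
    # Whitespace between two adjacent encoded-words is not part of the text.
    value = re.sub(r"(\?=)\s+(=\?)", r"\1\2", value)
    return ENCODED_WORD.sub(lambda m: decode_q(m.group(1), m.group(2)), value)


if __name__ == "__main__":
    subject = (
        "Re: Maya the Bee ( =?windows-1255?Q?=E4=E3=E1=E5=F8=E4_=EE=E0=E9?= "
        "=?windows-1255?Q?=E4=29?="
    )
    decoded = decode_subject(subject)
    # Show the code points in logical (storage) order.
    print([f"U+{ord(ch):04X}" for ch in decoded if ord(ch) > 0x7F])
```

    The decoded characters come out in logical order; putting them on screen in the right visual order is still the job of whatever displays the string.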

    And on the other hand anything that doesn't understand the encoding technique is quite apt to misinterpret it and not show what is expected....

    For the body, if whatever control is holding the body knows how to properly use the Unicode Bidi algorithm then it will properly display the text, though the behavior Jan describes suggests that at least some pieces do not know how to interpret the text properly. The fact that it does not corrupt the text makes it somewhat easier to be okay with the interim display issues. :-)

    Avoiding this kind of issue? More or less the answer is to avoid processing text in these interim stages, since it is likely way too easy to corrupt the text in the meantime.

    Other recent posts of mine like this one and this one and this one jump into the handling of RTL fragments within LTR text and LTR fragments within RTL text. Which is not easy under the best of circumstances, though stay tuned, as I might suggest some additional methodologies to consider. :-)

     

    This blog brought to you by ה (U+05d4, aka HEBREW LETTER HE)

  • Sorting it all Out

    Uncheck != uninstall (even in SP3 or SP* for that matter)

    • 0 Comments


    Via the Contact link, Stapp asks:

    Hello there,

    I read your thread here
    http://blogs.msdn.com/michkap/archive/2007/06/14/3288145.aspx

    I have the East Asian pack installed by default but never wanted it, and now I have installed SP3.

    Do you know if SP3  allows the removal of that language pack by just unticking?
    It would be better if someone more competent than me could test this!
    Thanks for reading
    Stapp

    The two relevant posts here are The one-way trip of installing supplemental language support (the one Stapp pointed to) but also the more importantly relevant Unchecking the checkbox does not necessarily mean 'uninstall'.

    It is that latter post that explains that you can uncheck the checkbox to disable the East Asian support by removing all of the relevant registry keys pointing to files and such.

    It won't remove the hundreds of megabytes of files, but it will remove the functionality itself and leave the files there, unused.

    SP3 neither improves this situation nor makes it any worse -- same results after it is installed.

    And then in Vista and Server 2008, neither install nor uninstall is available -- the files are always present....

    Now given the text that appears when you do that uninstall:

    I agree this is a bug - the clear implication of the text is that the files are being removed. But this is a bug that I doubt is ever going to be fixed in any service pack or QFE/hotfix.....

     

    This post brought to you by 譃 (U+8b43, a Unified CJK ideograph)

  • Sorting it all Out

    I Adar you! Hell, I Double Adar you!

    • 8 Comments


    You probably wouldn't ever have guessed, but this blog is going to be about the Hebrew month of Adar (אדר).

    Now most years it is a nice tidy little month, but the Hebrew calendar starts jumping too far ahead if left to its own devices, so seven out of every nineteen years an extra month is added -- generally this is known as intercalation.

    This happened in the very year in which we now are, as fate would have it.
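
    The seven-in-nineteen pattern can be written as a one-line test. The formula below is the usual arithmetic shortcut for the cycle as I know it, not something taken from Windows or .NET, and the function names are only for illustration. The year we are in now is 5768.

```python
def is_hebrew_leap_year(year: int) -> bool:
    """True when the Hebrew year gets a second Adar (7 years in every 19)."""
    return (7 * year + 1) % 19 < 7


def leap_positions_in_cycle() -> list:
    """Positions (1 through 19) within the cycle that are leap years."""
    return [position for position in range(1, 20) if is_hebrew_leap_year(position)]


if __name__ == "__main__":
    print(leap_positions_in_cycle())
    print(is_hebrew_leap_year(5768), is_hebrew_leap_year(5769))
```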

    And here is where we run into issues.

    You see, this extra אדר (Adar) stuff has been going on for a long time.

    And אדר (Adar) has some interesting holidays in it, like תענית אסתר (The Fast Of Esther) on the 13th of Adar, פורים (Purim) on the 14th of Adar, and שושן פורים (Shushan Purim) on the 15th of Adar.

    This leads to an interesting question when there are two of those אדר (Adar) months popping up -- which אדר (Adar) do we use to celebrate?

    Now קראים (Karaite Jews), or perhaps we could call them (for lack of a better term) Biblical Jews, keep themselves in the world of the תנ״ך‎ (The Tanakh, the Jewish Bible). This is as opposed to the (for lack of a better term) Rabbinical Jews, who have the משנה (Mishnah) and גמרא (Gemara) as a huge amount of additional commentary and law and discussion and argument.

    So why is this interesting?

    Well, those קראים (Karaite Jews) celebrate the holidays in the first אדר (Adar), and the rest celebrate them in the second אדר (Adar) - based on text in the משנה (Mishnah) that instructs as much. Which kind of explains why the קראים (Karaite Jews) don't heed those rules, since they don't consider the משנה (Mishnah) to be law, after all.

    So most Jews look at the 14th of that first אדר (Adar) as פורים קטן (Purim Katan -- "Little Purim") and the 15th as שושן פורים קטן (Shushan Purim Katan). There aren't any specific rules on things that must be observed or anything, but there is kind of a minor festive aspect for people who have a bit of a desire to "get their party on" as often as they can. And I have been to a couple of עדלאידע celebrations over the years (עדלאידע is one of those fun words that make for a great party theme -- it means "until you don't know" because you are supposed to keep drinking until you don't know the difference between the good guy and the bad guy of the story of Purim. I am sure you can imagine a drinking game that can come out of this quite easily!).

    Okay, so we have פורים קטן and שושן פורים קטן and תענית אסתר and פורים and שושן פורים. Got it?

    Now we'll add computers to the mix.

    Specifically, we'll add Windows -- which calls these months אדר (Adar) and אדר ב (Adar Bet/Adar 2), which freaks out some people because they would prefer something more like the .NET side of the world has it in the HebrewCalendar class with an אדר א (Adar Alef/Adar 1) and an אדר ב (Adar Bet/Adar 2). And to make matters worse, some Microsoft products reportedly call the month אדר א (Adar Alef) during non-leap years, when technically it should be (if one had to choose between the two), something more like אדר ב (Adar Bet).

    And there are also random bugs reported in programs like Outlook (as this site points out in the article Hebrew calendar leap year mistake).

    And the story of פורים as told in the מגילת אסתר (Book of Esther) has interesting weirdnesses on its own -- what with Queen Vashti (ושתי) being asked by the king to dance naked for the court, and when she refused, she was killed1 and all of the other interesting pieces, including the bit about how the decree to kill the Jews could not be set aside, but a second decree to allow the Jews to defend themselves was legal and so the battle was not so one-sided as it might have been otherwise. All I know is that if I were king and I were the sort to have people dancing naked for the court, I would be allowed to do whatever the hell I wanted and reverse any decree that seemed like a bad idea, especially if I drank as much as this king reportedly did and the next morning realized that not every drunken decree is necessarily a good idea....

    In the end, the report of the difference between Windows and .NET is a perpetual thread -- raised each time someone notices it, possibly with a bug or several bugs put in. But (ignoring the reported bug in Outlook with the recurring year mistake2), there is technically not a bug here, though there is an inconsistency, and the people who take offense at the implied precedence of the first אדר (Adar) being called plain old אדר (Adar) are free to their opinions but maybe they would not feel strongly if there were not reported bugs implying people were misunderstanding the rules.

     

    1 - According to Jewish sources; some Christian sources just have a divorce happening, though given the king -- known to have people executed even for appearing when they are not called -- this seems a bit out of character.
    2 - which I will conditionally choose to believe knowing how they mess up with Diwali and all. :-)

     

    This blog brought to you by מ (U+05de, aka HEBREW LETTER MEM)

  • Sorting it all Out

    Y oh Y does YYYY sometimes mean YY, you ask?

    • 6 Comments


    You know, in many cases the date that you run into a problem can be rather directly related to the actual problem.

    I'm not thinking of what happens when you wear orange on Saint Patrick's day in Fishtown; I was thinking of programming issues. :-)

    Frank Hauptlorenz's question to the microsoft.public.dotnet.internationalization newsgroup the other day was a really good example:

    SUBJECT: DateTime.Now.toString("yyyyMMdd") depending on CultureInfo?

    Hi,

    I have this instruction mentioned at top running on an japanese client.
    Strangely this renders to "200809" (just year and day).

    Has anybody an idea?

    Thank you,
    Frank

    Now everyone can refresh their memory (or fill their cache, whichever!) by reading Long live the Emperor (ignoring the issue therein but paying attention to descriptive information) for some background on the Japanese Wareki calendar before I start blathering anew here....

    Okay, we're all on the same page now?

    Good.

    This is a calendar that does not account for the idea of a reign longer than 100 years.

    I mean, seriously -- how many of the 125 listed in Wikipedia's List of Emperors of Japan spanning 660 BC to present reigned for over 100 years?

    If you do the math, you see an average reign of a little over 21 years, which is astounding given how young one could be to ascend....

    But in any case, Windows follows suit here, and passing YYYY will be ignored because only YY is really going to be paid heed.

    But (and this is where the timing comes into play) all of this was harder to spot since we are in the 20th year of the 平成 (Heisei) era which happens to be in the year 2008.

    So when one is looking at 200809, one might tend to see 2008/09 rather than 20/08/09.
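
    Here is the coincidence spelled out in a few lines of Python. This only imitates the behavior described above (a two-digit era year in place of the four-digit Gregorian one); it is not how .NET implements its calendars, the function names are invented, and only the Heisei era is modeled.

```python
from datetime import date

HEISEI_START = date(1989, 1, 8)  # first day of the Heisei era


def heisei_year(d: date) -> int:
    """Era year for a date in the Heisei era (1989 is year 1)."""
    if d < HEISEI_START:
        raise ValueError("date is before the Heisei era")
    return d.year - 1988


def format_era_yyyymmdd(d: date) -> str:
    """Imitate "yyyyMMdd" when the year field collapses to a two-digit era year."""
    return f"{heisei_year(d):02d}{d.month:02d}{d.day:02d}"


if __name__ == "__main__":
    sample = date(2008, 8, 9)
    # Era-based result next to the Gregorian one the developer expected.
    print(format_era_yyyymmdd(sample), sample.strftime("%Y%m%d"))
```

    The era-based string for 9 August 2008 starts with the same four digits as the Gregorian year, which is exactly what makes it so easy to misread.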

    I guess you could think of this as one of the gotchas of ignoring cultural conventions (like that yyyyMMdd format) while still doing things in the context of a culture. Because culture will bite you in the butt if you aren't paying it its proper respect! :-)

     

    This blog brought to you by 𐅹 (U+10179, aka GREEK YEAR SIGN)

  • Sorting it all Out

    Fight the Future? (#5 of ??) aka How to avoid jury duty without feeling guilty or offended?

    • 4 Comments


    I actually don't fully understand what happened.

    I was asked to report for jury duty.

    No worries, I thought. All I cared about was whether or not all of the places I might have to be were accessible. I assumed I had to serve since I had already been excused once, pre-scooter.

    So I asked in the bit of the form that gave me space to do so about the accessibility.

    After hearing nothing back and with the date approaching, I called the number to ask directly and see what was up.

    And I was told that I had been excused from jury duty.

    I haven't decided how I feel about this just yet.

    Part of me realizes that they have not just done me a favor, they have done both the prosecutor and the defendant a great favor as well -- if I were either of them I would probably be willing to burn a peremptory challenge on the likes of me, just to keep me off the panel.

    But I am to be honest at a loss as to why it happened, or whether I should be upset or not.

    I mean even ignoring the old joke about how unfair the whole system is since a "jury of one's peers" is unlikely to be found in finding twelve people who weren't smart enough to get out of jury duty, should I be offended?

    Since I'd rather not actually serve, should I just be happy that I have been excused from doing so?

    I should probably just forget about it.

     

    This blog brought to you by ㈹ (U+3239, aka PARENTHESIZED IDEOGRAPH REPRESENT)

  • Sorting it all Out

    To hai6 or not to xì? 係 is the question!

    • 2 Comments


    I had a friend call me and tell me to watch Ocean's Twelve last night, after verifying that I was still watching TV with closed captions on.

    Mysterious, but I figured what the hell.

    Right near the beginning I saw what he wanted me to notice.

    Now I have made my peace with the fact that closed captioning does not support Unicode (an issue I have talked about before, e.g. here).

    All of The Amazing Yen's parts were captioned with:

    Yen speaking Mandarin

    I saw it right away, even before the actor could be seen:

    Yet the big inside joke of the Ocean's Eleven remake was that Yen was speaking Cantonese, and that only Rusty understood him.

    Mandarin? Was this yet another error in closed captioning?

    Well, it depends, really.

    I mean, by the time Ocean's Twelve came along, everyone seemed to understand Yen -- so they had dropped that particular joke.

    And no clues come back from Qin Shaobo (the actor) since he was born in Guangxi, China -- where Mandarin is one of only two official languages but where Cantonese is widely known. So really he might know both well enough for the part (I don't know Cantonese or Mandarin well enough to guess from the small sample, where none of my small vocabulary came up. :-)

    There was one terribly funny joke in there. From the script:

         Yen pops his head out from a small tube and says something in Chinese.

         Frank shrugs...doesn't understand. Yen tries again.... This time he enunciates very clearly and talks very loudly (like Americans do when foreigners don't understand English).

         Frank nods, starts turning the handle of the water pump in the opposite direction. Yen climbs down out of the tube.

    And that is funny. Notice in the movie how Bernie did understand him the second time. :-)

    Compare that to the Ocean's Eleven script:

         Silence. For a moment, each man keeps his two dozen questions or more to himself. At last, one speaks up...

         The Amazing Yen. In Cantonese. Of course, no one understands him. Except Rusty.

                                    RUSTY
                             (in response)
                      No. Tunneling is out. There are Richter scales monitoring the ground for one hundred yards in every direction. If a groundhog tried to nest there, they'd know about it. Anyone else?

         Another silence. Either the guys are too dumbfounded by that bilingual exchange or too numbed by the task ahead of them to speak.

    Have any Chinese speakers seen any of the three movies, and can you identify The Amazing Yen's language? Is this a case of an error in the closed captioning content, or a simple change from script to screen for reasons unknown?

     

    This blog brought to you by 係 (U+4fc2, a CJK UNIFIED IDEOGRAPH)

  • Sorting it all Out

    Kind of ironic how Germany seems so okay with Capital *Letter* punishment, huh?

    • 13 Comments

    Note that this post is entirely offtopic and if that kind of thing bothers you then you're invited to get out right now....

    Over in the Suggestion Box, mpz asked:

    Suggestion: Write about the new Latin capital letter sharp S introduced in Unicode 5.1.0.

    Fair enough....

    Though to be honest, by the time I get through:

    I think my thoughts on the matter have been pretty much covered.

    It is hard to say how things will go on that last point, as my opinions are fairly controversial and it is just as likely that they will not go in that direction....

    But otherwise, the invention of letters that do not actually exist is quite powerful, as is the decision to ignore intuitive casing behavior or make unrealistic case mappings. Unicode has been doing it for some time and they seem pretty popular.
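    For anyone who wants to see the casing behavior in question, here is a minimal sketch in Python (my choice of language for illustration, not anything the suggestion asked about). It assumes a Python 3 build whose Unicode database includes U+1E9E, and it only reports the default, locale-independent mappings that Python's string methods apply; the helper name `describe` is made up for this example.

```python
import unicodedata

SMALL_SHARP_S = "\u00df"    # ß
CAPITAL_SHARP_S = "\u1e9e"  # ẞ, the letter added in Unicode 5.1


def describe(ch: str) -> dict:
    """Collect the default case mappings Python reports for one character."""
    return {
        "name": unicodedata.name(ch),
        "upper": ch.upper(),
        "lower": ch.lower(),
        "casefold": ch.casefold(),
    }


small = describe(SMALL_SHARP_S)
capital = describe(CAPITAL_SHARP_S)

# Uppercase the small letter, then lowercase the result again.
round_trip = SMALL_SHARP_S.upper().lower()

for info in (small, capital):
    print(info)
print("round trip of the small letter:", round_trip)
```

    The point to notice is the asymmetry: the capital letter lowercases to ß, but ß does not uppercase to the capital letter, so the round trip does not bring you back to where you started.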

    The whole issue makes me wonder about how Germany really feels about capital punishment, given all of the capital letter punishment they seem comfortable with. :-)

     

    This post brought to you by ß and ẞ (U+00df and U+1e9e, LATIN SMALL LETTER SHARP S and LATIN CAPITAL LETTER SHARP S)

  • Sorting it all Out

    Fight the Future? (#15 of ??), aka Who forgot the culture?

    • 4 Comments

    Note that this post is entirely offtopic and if that kind of thing bothers you then you're invited to get out right now....

    It was back in December of 2006 that I first posted For the [locale] explorer in you...., a blog about the sample that Francois Liger had updated on GotDotNet, Culture Explorer 2.0.

    As it turns out, the timing of the update was very unfortunate.

    Because GotDotNet is no more -- it has been shut down, and as that page mentions:

    Secondly, if you are one of those generous people who had recently uploaded a sample – or multiple samples – to the GotDotNet site, your contribution may be amongst the samples on the MSDN Code Gallery site. We have migrated popular samples (dating back to the beginning of 2007) to the MSDN Code Gallery site.

    So if people like me and others who had been nagging Francois to update the sample had been a bit lazier or less annoying, they might have migrated the sample themselves? :-)

    Of course there are flaws in this theory, since

    • There are any number of people who would pay good money to see me be "less annoying" to them and it has not had much impact so far, and
    • If the sample were hugely popular in the conventional sense into 2007 they probably would have migrated it anyway -- international stuff just lacks visibility sometimes!

    Anyway, recently someone named Michael commented on this fact:

    Is Culture Explorer 2.0 still available anywhere? GotDotNet is gone now, and I can't find the app in the MSDN Code Gallery, CodePlex or anywhere else. Nothing new comes up in searches for the app, or for the author either. Sorry to ask here, but I can't find anything anywhere else, so I thought I'd at least ask.

    thanks,

    michael

    Michael is telling no lies here, and I verified with Francois that he hadn't migrated it under some other name or anything weird like that.

    But he will have the chance to look into it now, at least.

    I do know that the red tape you have to go through to put stuff on these other sites is a bit more effort than GotDotNet was, which makes all of this a bit more of a worry. But I'll get my nag on and see if I can help expedite things. It is a nice sample, after all. :-)

     

    This post brought to you by ᠱ (U+1831, a.k.a. MONGOLIAN LETTER SHA)
