January, 2012

Sorting it all Out
Michael Kaplan's random stuff of dubious value
Be sure to read the disclaimer here first!
  • Sorting it all Out

    Sometimes things are extended in the wrong direction....

    SQL Server's code page, collation, casing, locale, and resource model are all direct attempts to extend the things that Windows provides in ways that make sense for SQL Server.

    This sentence bears repeating, I think. Because it seems (in the words of the late, great George Carlin) vaguely important.

    SQL Server's code page, collation, casing, locale, and resource model are all direct attempts to extend the things that Windows provides in ways that make sense for SQL Server.

    Okay, I feel better now.

    Now generally speaking I find the model to be supportable -- for example the way they snapshot the casing table in a given collation version rather than relying on the OS table (which could have disastrous compatibility concerns as they cover multiple versions of collations).

    To be honest, even in those cases when I don't agree with what they do at all -- like the way they split user locale/UI language/formatting/collation differently, or the way they combine "redundant" collations that can cause geopolitical discomfort/technical challenges -- I still respect their choices because, after all, they own the choices and the consequences thereof.

    There are two problems that I really think they should fix though, in some future version of theirs:

    The fact that the Hosted CLR (aka SQLCLR) doesn't use the SQL Server collations of the server they are on, which truly limits the usefulness of managed code in stored procedures, since you can't code against the server's own behavior.

    The fact that SQL Server makes its "server collation" setting based on a DEFAULT_SYSTEM_LOCALE analogue instead of a UI language of the LOCALSYSTEM account analogue, which blocks collations that are from "Unicode only" locales -- e.g. Hindi -- from working as server collations.

    There is no good reason for the limitation itself -- what better reason is there to remove it? :-)

    More about this setting here....

    Maybe it's our fault; we did, after all, call it DEFAULT_SYSTEM_LOCALE.

    I guess they were just extending our naming mistake of yore....

  • Sorting it all Out

    An SDK for the OSK? No way. Though if I may, I'll just say...

    So the other day, I was contacted by hal hubschman, with a message.

    Kind of a request for information.

    It went like this:

    Subject: on screen keyboards

    i read a number of your blog entries which i found informative.  i have been trying to find a osk sdk with no luck.

    do you know if one exists ?

    thank you for the blogs you have created

    hal

    Hmmm.

    I usually hate questions like this.

    Because they force me to answer with a rather hopeless negative.

    I mean, there is no public Software Development Kit (SDK) for any version of the On Screen Keyboard (OSK) shipped with any version of:

    • Windows (including the new one in Windows 8)
    • Office
    • Tablet PC

    whatsoever.

    I hate that - very frustrating, why even cover this one in a blog?

    But then, after writing the above, I looked over next to me.

    To my Amazon Kindle.

    On the screen?

    The book I had just finished re-reading: Ender's Game, by Orson Scott Card.

    And I suddenly wondered whether, if Ender Wiggin expressed such a depressing thought, Bean wouldn't push him past this seemingly unbreakable wall.

    As a by the way, and not for nothing, for those of you who have never read Ender's Game, by Orson Scott Card, you're truly missing a really good book. You should look into it!

    I suddenly realized I was looking at this the wrong way.

    Each of those On Screen Keyboards, those Soft Keyboards, were designed to wrap the various keyboards provided by Windows and kbdtool.exe, by MSKLC and kbdutool.exe.

    So perhaps there was no SDK to let anyone control any aspect of any version of the OSK.

    But you could, through MSKLC, control any exposed aspect of the OSK's behavior anyway!

    Okay, this is not exactly the answer hal hubschman was perhaps hoping for -- maybe he wanted a way to extend an actual OSK, any OSK.

    The only supported way to do that, however, is to create one's own OSK.

    That would take a bunch of blogs worth of knowledge, though a lot of them technically already written....

    It would be a huge effort to try to string them all together like a jigsaw puzzle, filling in the gaps representing missing blogs, though not completely impossible.

    Just highly improbable!

    For now I'll hope that my other answer is sufficient for hal hubschman's question. :-)

  • Sorting it all Out

    If font linking doesn't fit the text to a T (or ț!), a Romanian letter may be right but not quite look it

    I've talked about font linking in a bunch of different blogs over the years, such as:

    and so on. Many of them make a point implicitly that I am going to make quite explicit today.

    And that point is simply: the thing that sucks most about GDI font linking is the way it mixes fonts that don't look right next to each other.

    Anyway, the other day colleague and coworker Laura passed on a question people were trying to get a handle on.

    Basically, it was a Romanian product that had strings looking something like this:

    The Ts don't all seem to line up here!

    Let me blow that up in case your browser doesn't make it bigger:

    The Ts don't all seem to line up here, but bigger!

    Now they were claiming to be using Segoe UI.

    And for whatever it's worth, the glyphs of most of the characters look to my amateur eye to be somewhat Segoe-like, perhaps. But not exactly.

    Remember that at one point, there were many more fonts floating around that were missing these characters.

    Perhaps a forensic typographer could do better, a-la İ şéè đêäđ ķéÿš etc.

    But do you see the fourth letter, which is clearly U+0074, aka LATIN SMALL LETTER T, versus the sixth letter, which is clearly U+021b, aka LATIN SMALL LETTER T WITH COMMA BELOW?

    Well, Segoe UI does not look this weird with these Latin letters alongside each other.

    In fact, here is just about every S and s and T and t in Segoe UI, right next to each other:

    Comparing all the Ss and Ts in Segoe UI -- all the same size.

    None come even close to being this incorrect looking....

    For the sake of good looking Romanian text, at least!

    Thankfully, they're going to fix this one, before a negative entry would need to be added to lists like The history of messing up Romanian on computers. If you know what I mean....

    Though even if not, we've come a long way banishing the cedillas from Romanian text, right? :-)

  • Sorting it all Out

    3 x 7 can be a lot more than 21, sometimes!

    So we currently have the Building Windows 8 (An inside look from the Windows engineering team) Blog.

    It is localized into French, aka Français.
     
    And it is localized into German, aka Deutsch.
     
    And it is also localized into Brazilian Portuguese, aka Português (Brasil).
     
    We didn't skip localizing into Korean, aka 한국어.
     
    And it was also important localizing into Japanese, aka 日本語.
     
    We didn't ignore localizing into Chinese, aka 简体中文.

    And last but not least we localize into Russian, aka Русский.

    As a Blog being localized, the project goals are clear.

    The blog will build interest and enthusiasm among developers, IT Pros, and enthusiasts for the next version of Windows. It is the engineering blog for Windows 8, and the content is provided by Steven Sinofsky and the Windows engineering team.

    For the WWLI content localization team, our goal was to accomplish the following:

    • Build interest and enthusiasm among international developers, IT Pros, and enthusiasts for the next version of Windows by providing localized blog content 
    • Provide parity and consistency of message across the languages selected for localization 
    • Release the localized B8 blog in 7 languages (DE, FR, KO, JA, PT-BR, RU, ZH-CN) no later than 2 business days after the English blog.
    • Build a sustainable, scalable, and cost-effective process for blog localization

    We are now well over 45 blogs in, and feedback has been really good.

    But we didn't stop there.

    We also have a Windows Store for developers Blog, and we have started doing the localization for the very popular IEBlog as well.

    And each will be localized into those same seven languages!

    Quite a grid we're building (for each language: the Building Windows 8 blog with its tagline; the Windows Store blog; the IEBlog):

    • English: Building Windows 8 (An inside look from the Windows engineering team); Windows Store for developers; IEBlog
    • French: Conception de Windows 8 (Vision en coulisses de l'équipe d'ingénierie Windows); Blog Windows Store pour les développeurs; IEBlog Français
    • German: Die Entwicklung von Windows 8 (Einblicke in die Arbeit des Windows-Entwicklerteams); Windows Store-Blog für Entwickler; IEBlog Deutsch
    • Brazilian Portuguese: Criando o Windows 8 (Nos bastidores com a equipe de engenharia do Windows); Blog da Loja do Windows para desenvolvedores; IEBlog Português
    • Russian: Создание Windows 8 (Взгляд изнутри от группы разработчиков Windows); Магазин Windows: блог для разработчиков; IEBlog Русский
    • Japanese: Building Windows 8 (Windows エンジニアリング チームによるブログ); Windows Store 開発者向けブログ; IEBlog 日本語
    • Korean: Windows 8 빌드 (Windows 기술팀 내부 모습); 개발자용 Windows 스토어 블로그; IEBlog 한국어
    • Simplified Chinese: Building Windows 8 (来自 Windows 工程团队的内部视点); 面向开发人员的 Windows Store 博客; IEBlog 简体中文

    Nothing short of Amazing!

  • Sorting it all Out

    The evolving Story of Locale Support, part 16: We can't scale to a Xishuangbanna Dai locale, but…

    Previous blogs from this series:

    This series has been largely discussing a particular "meta-issue".

    The fact that our model for locale support is indeed evolving. And much more quickly than usual.

    Some of the blogs in this series capture the "missing links", which can be invaluable since not everything can be deduced from a finished product.

    Examples?

    Part of it is the new keyboards that can support languages for which no locales currently exist, as described in part 2.

    And part of it is in the new list of languages that sits under those new keyboards and supports way more than our locale list can perhaps ever reach, as described in part 15.

    Exciting times, aren't they? :-)

    Well, let's add one of those new keyboards.

    Like the one for the New Tai Lue script/Xishuangbanna Dai language:

    Adding the Xishuangbanna Dai (New Tai Lue) keyboard layout

    Cool!

    Even cooler -- how quickly I typed ᦎᦷᦑᦺᦜᦺᧈ ᦉᦲᧇᦉᦸᧂᧅᧃᦓᦈ with the Windows 8 soft keyboard. :-)

    Admittedly I built the original keyboard layout it was based on -- your mileage may vary....

    If you have the Developer Preview you can see how we are improving here already in supporting finding new language names via some of those script names!

    It's on our Language List now, and everything!

    The updated Windows 8 language list, with that New Tai Lue at the bottom

    This is awesome!

    But then we ran into a problem when we tried to search for some New Tai Lue script/Xishuangbanna Dai language text in an XPS file we just created.

    Because we were using that New Tai Lue keyboard, the one that the "WinLangDB" list put under the code of khb, for the Lü macrolanguage.

    The search ends up failing since functions like FindNLSStringEx can only handle supported locales by NLS rules, not WinLangDB rules. This is no problem in Notepad, which uses the default user locale, but a big problem for anyone that tries to be more clever than that.

    In a way, it's surprising that it wasn't found earlier. I guess there aren't too many things using the clever way -- we should maybe have more of them!

    Obviously this is a small mis-step in the bold move forward to support things that we don't fully support, and we'll have to figure out what to do here.

    In fact, people are looking at this right now. Since NLS supports all of the underlying characters in the default table, there are lots of possible solutions (note that if nothing were done then you couldn't even search for English text since an "invalid" locale name makes the NLS functions fail!).

    I mean, since we lack the resources at this point to add a Xishuangbanna Dai locale. :-)

    Of course this is much bigger than New Tai Lue -- the WinLangDB list supports such a huge subset of valid ISO-639-3 codes that doing nothing would hurt even more than just this one case!

    But truthfully I'm not worried here -- 15+ steps forward and one step back in the pre-release has plenty of time to be changed to 16+ steps forward in the actual release.

    Alternately, if they don't fix it then it will make a great KB article, maybe. :-)

    And either way, I'm proud to be a part of those steps in our evolving Story of Locale Support....

  • Sorting it all Out

    You can do CESU-8 if you need to; we went in a slightly different direction....

    Regular reader Dan asked me via the Contact link:

    We just upgraded our customer desktops from Windows 2000 to Windows 7, and we're seeing a major break in our text processing app.

    We've debugged the problem pretty thoroughly, and it doesn't look like it's our app at all. Notepad seems to be breaking our Plane 1 and Plane 2 text. Which seems like it must be impossible, isn't support of supplementary characters a Windows 7 feature?

    I'll admit I was confused at first, though the problem he described seemed kind of familiar.

    And then it came to me.

    The app was basically supporting supplementary characters on Windows before we really were.

    Well, that isn't entirely accurate.

    But there was a weird time before XP shipped when we were okay with the six byte form for supplementary characters, before Unicode got more explicit about considering it to be ill-formed and before we started conforming to Unicode's stricter definition....

    Dan's Line of Business app was essentially using CESU-8, not UTF-8. And given the weird difference between how Notepad initially detects UTF-8 and how it converts the data -- described previously in blogs like (It wasn't me) -- the solution becomes clearer.

    • Either move to the 4-byte form of UTF-8 supplementary characters, or
    • Do all of the conversion outside of Notepad and Windows!

    Personally, I'd recommend the former option -- the latter is kind of contrary to what Windows, Microsoft, and Unicode are doing these days.

    Though if an application has a heavy investment in the 6-byte form, then as long as it is kept internal to the app (or properly marked when communicating with those who understand it), it isn't the end of the world....
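
    For anyone leaning toward the first option, here is a minimal sketch in C of what "moving to the 4-byte form" amounts to. The function name cesu8_to_utf8 is made up for illustration (it is not a Windows API), and the sketch assumes the input is otherwise well-formed: it only recognizes a high surrogate followed directly by a low surrogate, each encoded as three bytes, and copies everything else through untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: rewrites CESU-8 input as UTF-8.
 * Each 6-byte surrogate pair (ED A0..AF xx, ED B0..BF xx) becomes one
 * 4-byte sequence; every other byte is copied through unchanged.
 * Returns the number of bytes written. The output is never longer than
 * the input, so a buffer of `len` bytes is always big enough. */
size_t cesu8_to_utf8(const uint8_t *in, size_t len, uint8_t *out, size_t out_cap)
{
    size_t i = 0, o = 0;

    assert(out_cap >= len);
    while (i < len) {
        if (i + 5 < len &&
            in[i] == 0xED && (in[i + 1] & 0xF0) == 0xA0 && (in[i + 2] & 0xC0) == 0x80 &&
            in[i + 3] == 0xED && (in[i + 4] & 0xF0) == 0xB0 && (in[i + 5] & 0xC0) == 0x80) {
            /* Rebuild the two UTF-16 surrogates from their 3-byte forms. */
            uint32_t hi = 0xD000u | ((uint32_t)(in[i + 1] & 0x3F) << 6) | (uint32_t)(in[i + 2] & 0x3F);
            uint32_t lo = 0xD000u | ((uint32_t)(in[i + 4] & 0x3F) << 6) | (uint32_t)(in[i + 5] & 0x3F);
            /* Combine them into the supplementary code point. */
            uint32_t cp = 0x10000u + ((hi - 0xD800u) << 10) + (lo - 0xDC00u);

            /* Emit the standard 4-byte UTF-8 form. */
            out[o++] = (uint8_t)(0xF0 | (cp >> 18));
            out[o++] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (uint8_t)(0x80 | (cp & 0x3F));
            i += 6;
        } else {
            out[o++] = in[i++];
        }
    }
    return o;
}
```

    Fed the six bytes ED A0 BD ED B8 80 (the surrogate pair D83D DE00), this produces F0 9F 98 80, which is U+1F600 in real UTF-8. Unpaired surrogates are passed along as-is, so a production version would want to decide what to do about those.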

  • Sorting it all Out

    It is *not* called the Desert Desert, dammit!

    You probably know people, or work with people, or are one of those people, who calls it an "ATM machine".

    And you may know people, or work with people, or are one of those people, who calls it a "Light Emitting LASER".

    Hopefully you laughed when the mobsters from the movie Mickey Blue Eyes noted the sign on the restaurant "The La Trattoria", despite the fact that the meaning of the phrase was "The The Trattoria", as a sign of irony at the faux intellectual bravado leading to ignorance of a proper name of a fancy Italian restaurant.

    Perhaps you were around to chuckle with Mike or Cathy or me when a particular group admin used to refer to "Very VIP people", realizing the meaning of "VIP" includes the word being repeated.

    Many people I know found it mildly ironic that natives and former natives of Iran wanted their language فارسى (spelling FEH ALEF REH SEEN ALEF MAKSURA) called Persian rather than Farsi, even though the native name was pronounced FARSI by those same native speakers.

    And yet, I do not want to know if you make this other mistake.

    I am going to say La La La I am not listening to you yet you are still talking if you try to confess you are guilty of it.

    And after you have read this blog today you will be able to point out the mistake any time someone else makes it.

    Anytime you see Google Maps refer to locations as being in the "Sahara Desert" like so:

    Because the word for Desert in Arabic is

     صحراء

    aka ARABIC LETTER SAD + ARABIC LETTER HAH + ARABIC LETTER REH + ARABIC LETTER ALEF + ARABIC LETTER HAMZA

    aka SAD HAH REH ALEF HAMZA

    aka SAHARA.

    Because you know it is silly to call the sandy, hot region in Northern Africa the Desert Desert.

    You aren't that silly, right? :-)

  • Sorting it all Out

    …wondering about Paul: where he could be, who he's with, what he's thinking, and if he'll ever return someday…

    Cue somewhat gratuitous Hotel La Rut video:

    At least I know Paul was thinking of me, so I don't have to wonder about that part!

    You see, the other day, my friend (and former colleague from Microsoft) tweeted to me over Twitter:

    Ah, a somewhat under-documented bit of info, that.

    It involves LPKSETUP.EXE.

    Let's try running it:

    A nice dialog pops up, whether Vista or Windows Server 2008 or Windows 7 or Windows Server 2008 R2.

    It looks like this:

    Now obviously we want to Install display languages (if anyone wants to look into either the Uninstall display languages or How do I get additional display languages? options, they can do so, of course!).

    Anyway, once you choose to Install display languages, you'll see the option to Choose your method of install:

    That Launch Windows Update option may be of some interest, in other circumstances. After all, it points out the other way to find Language Packs.

    But for now we'll stick with Paul's scenario, and Browse computer or network. How better to Locate and install display languages manually, anyway?

    Here we go:

     

    Okay, I'll now explain a little bit of how this dialog works.

    You choose a directory.

    And then it will traverse that directory and its subdirectories (and no further sub-subdirectories!) to look for Language Packs or Language Interface Packs!

    I took the trouble to copy 16 different lp.cab files to my local Windows 7 machine, in a directory structure I have never personally witnessed:

    Now when I browse to the subdirectory, the magic happens.

    Well, I guess I can't call it "magic" just like I can't call a card trick "magic" if I have to manually change the order of the deck in front of you to make the trick work!

    Anyway, you'll see it detected a bunch of valid lp.cab files:

    Only 12 of them are valid for my Windows 7 x64 machine; the other 4 are for Windows 7 x86 machines -- though they could have also been other Windows version Language Packs/Language Interface Packs; they'll all fail here.

    I suppose this explains why LPKSETUP.EXE doesn't traverse any deeper -- since it is taking the time to open up every CAB file (I renamed several of them to be sure) and see if it's valid. Traversing too deep could potentially start to get painful!

    Note to CSS: A Microsoft Knowledge Base article aimed primarily at IT folk/system builders explaining how to properly make use of LPKSETUP.EXE to allow any language selection the IT folk/system builders choose to make appear to their users (with the proper steps to work properly with secured desktops) may be the most completely awesome-est KB article of the year! Any takers?

    Okay, so that's it. Hopefully it will answer Paul's question.... :-)

    I wonder what he's been up to, who he's with, what he's thinking (using Vista?!?), and if he'll ever return someday. Don't you?

    So I'm just wondering about Paul -- what he's been up to, who he's with....

    I suppose I am in a La Rut myself!

  • Sorting it all Out

    Avoiding the Snowpocalypse!

    I originally had a seat on the 8:30am flight to Las Vegas.

    And then God apparently said no, not so much.

    She made plans to dump up to a foot of snow or more on Seattle on Wednesday.

    Crap.

    Alaska Air was a bit short-sighted, and they charged me $100 for the change, but I decided to follow in the footsteps of brave Sir Robin:

    Brave Sir Michael ran away
    Bravely ran away.
    'Ere mounds of snow dumped upon his head, he bravely turned his tail and fled... 

    I'll try to blog something substantive tomorrow.

    Though I can't guarantee it will be as entertaining!

     

  • Sorting it all Out

    The evolving Story of Locale Support, part 15: Fixing our listings up in Windows 8!

    Previous blogs from this series:

    Windows support of locales, and in fact the whole locale model in Windows, is impressive.

    It's substantial.

    And....

    It's as confusing as all get out!

    I mean, even almost seven years ago when I wrote What is my locale? Well, which locale do you mean? to list and define all the different kinds of locales in Windows:

    • DEFAULT USER LOCALE (Windows XP term: "Standards and Formats")
    • DEFAULT SYSTEM LOCALE (Windows XP term: "Language for non-Unicode Programs")
    • DEFAULT USER INTERFACE LANGUAGE (Windows XP Term: "Language used in menus and dialogs")
    • DEFAULT INPUT LOCALE (Windows XP Term: "Default Input Language")

    I was self-consciously aware of how confusing everyone found all this.

    Now virtually everyone I talked to agreed that each term was entirely explainable, especially in Windows XP and later when they were each given new terms that didn't use the same word LOCALE over and over again.

    But the only ones who were willing to call this motley crew intuitive were completely and totally high at the time.

    And I'll be honest, the ones unwilling to call it intuitive were right.

    The model, as expansive and feature-filled as it may be, is incredibly confusing.

    The previous changes aimed at incrementally improving terminology were perhaps worthwhile, but ultimately unable to solve the real problem.

    Until Windows 8....

    Now first they take the old Regional and Language Options:

    (shown here from Windows 7) to start.

    Now instead of one Control Panel Applet, there are now two in the Control Panel:

    One for Region, with just three tabs, none of which say Language:

    and one new one, for Language:

    Now this Language Control Panel Applet is for User Interface Languages (if they are installed), for other language-specific services (if they are installed) and for Keyboard Layouts (whether atop actual hardware keyboards or soft keyboard layouts).

    You can see the new Keyboard List right here - notice the order is the same as from the Language applet, above:

      Now this does start to thin the herd in a more meaningful way.

    Though speaking for myself it is an odd direction when you consider that the Formats list is configured in the Region applet while the Keyboards list is configured in the Language applet, and that both of their built-in lists are considerably larger than the lists of User Interface languages or any other services.

    Though now with changes like the ones described in part 2 (raising the roof on keyboards), the keyboard list is now no longer completely limited to "supported locales" that populate the Formats List anyway.

    So perhaps my concerns about the mode of disconnect are unwarranted. :-)

    I will conditionally consider this to be a good evolutionary step that will simplify setting up Windows for typical users -- whether changing UI language, adding keyboards, or whatever.

    In the long run, I think the direction here will only get better and better over time.

    Now in future parts, I'll dig in further here, looking at new programmatic means of getting information....

  • Sorting it all Out

    I'm reasonably certain that those who disagree with me here are wrong!

    So, the other day, I wrote How to detect if a locale is Bidi, Windows 7/8 edition.

    This is a topic I had covered a bunch of times over the years, in many prior blogs, from How To [NOT] detect that a locale is bidi to How To detect that a culture is bidi to Cue the smarter version of GetDateFormat... ok, it's a wrap! and so on.

    Most of the Win32 answer prior to the introduction of LOCALE_IREADINGLAYOUT was using the LOCALESIGNATURE.

    More specifically, bits 123, 124, and 125 of the Unicode Subset Bitfields:

     Bit  Meaning
     123  Windows 2000 and later: Layout progress, horizontal from right to left
     124  Windows 2000 and later: Layout progress, vertical before horizontal
     125  Windows 2000 and later: Layout progress, vertical bottom to top

    The combinations of different values of these three bits make the description of almost any text directionality outside of Boustrophedon (or Rongo-Rongo) possible:

     Bit 123  Bit 124  Bit 125  Text Rendering Direction       IREADINGLAYOUT equivalent
     0        0        0        LeftToRight, then TopToBottom  0
     1        0        0        RightToLeft, then TopToBottom  1
     1        1        0        TopToBottom, then RightToLeft  2
     0        1        0        TopToBottom, then LeftToRight  3
     0        0        1        LeftToRight, then BottomToTop  n/a
     1        0        1        RightToLeft, then BottomToTop  n/a
     0        1        1        BottomToTop, then LeftToRight  n/a
     1        1        1        BottomToTop, then RightToLeft  n/a

    Now as the last column hints at, the four reading layout choices we support are all completely able to be derived from the LOCALESIGNATURE bits.

    The additional four rendering options theoretically able to be captured by these bits but not available to the new flag are not used as the primary rendering for any language we support.
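
    To make the derivation concrete, here is a small sketch in C of reading those three bits and mapping them to the reading layout value. It is a sketch under assumptions, not shipping code: the function names are made up, the four 32-bit words stand in for the lsUsb member of LOCALESIGNATURE, and it assumes bit 0 is the least significant bit of the first word.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns bit `n` (0..127) of a 128-bit field stored as four 32-bit words.
 * Assumption: bit 0 is the least significant bit of usb[0]. */
static int usb_bit(const uint32_t usb[4], unsigned n)
{
    return (int)((usb[n / 32] >> (n % 32)) & 1u);
}

/* Hypothetical helper: maps bits 123-125 to the LOCALE_IREADINGLAYOUT value
 * from the table above, or -1 for the four bottom-to-top combinations that
 * have no equivalent. */
int reading_layout_from_usb(const uint32_t usb[4])
{
    int right_to_left, vertical_first, bottom_to_top;

    assert(usb != NULL);
    right_to_left  = usb_bit(usb, 123);
    vertical_first = usb_bit(usb, 124);
    bottom_to_top  = usb_bit(usb, 125);

    if (bottom_to_top)
        return -1;                      /* the "n/a" rows */
    if (!vertical_first)
        return right_to_left ? 1 : 0;   /* horizontal: RTL or LTR */
    return right_to_left ? 2 : 3;       /* vertical: columns RTL or LTR */
}
```

    With this layout bit 123 is bit 27 of the last word, so a signature whose last word is 0x08000000 comes back as 1 (right to left), and one with bits 123 and 124 both set comes back as 2.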

    A part of me wishes both

    were entirely derived from these three bits, since doing that directly only satisfies my inner database developer that hates storing repetitive data in multiple places.

    Of course in a mature society, there is room for disagreement, but in this case I'm reasonably certain that those who disagree with me here are wrong. :-)

    Since had we done it sooner, it might have prevented us from shipping managed code bugs like the one described in It's not right when IsRightToLeft is wrong, and native code bugs like the one described in Double Secret ANSI, part 2 (the brokenest one yet, sorry 'bout that!).

    Because the best way to make sure the data is correct is to use the data.

    Early and often, as both those bugs that made it to shipping products prove quite effectively (to our detriment at the time).

    To be perfectly honest, I wish we would make this change even now, because we will always consider any differences between these three different items as a bug, as the best way to make sure that they don't fall out of sync is to use one source for all of them.

    We could in theory make this change later this week to the data behind the properties.

    Now I am an owner of the data, but this would also be code to change (in multiple products across multiple divisions). I can appeal to the owners to fix the long-term sync problem though.

    Before that, we can even fix the problem I mentioned the other day in How to detect if a locale is Bidi, Windows 7/8 edition, where we stop returning results that are incorrect 99% of the time (claiming verticality for CJK and Mongolian), by default...

    Technically, I could have used that idea and made this another part of the "The evolving Story of Locale Support" series, but I'm not confident that everyone will agree, so who knows whether we'll evolve that way, yet!

  • Sorting it all Out

    How to detect if a locale is Bidi, Windows 7/8 edition

    The other day, I was forwarded a question by a colleague.

    They wanted to know why the following code was not letting them detect Bidi properly:

    UINT ret = GetLocaleInfoEx(LOCALE_NAME_USER_DEFAULT, LOCALE_IREADINGLAYOUT, Layout, ARRAYSIZE(Layout));
    if(ret && (Layout[0] != L'0')) {
        // Treat the locale as Bidi: WARNING: THIS CODE IS WRONG!
    }

    Now this code is bad for several reasons.

    First of all, let’s look at the doc topic for LOCALE_IREADINGLAYOUT:

    Windows 7 and later: The reading layout for text. Possible values are defined in the following table.

    Value  Meaning
    0      Read from left to right, as for the English (United States) locale.
    1      Read from right to left, as for Arabic locales.
    2      Read vertically from top to bottom with columns to the right, and read from right to left, as for the Japanese (Japan) locale.
    3      Read vertically from top to bottom with columns proceeding to the right, as for the Mongolian (Mongolian) locale.

    Now obviously code that assumes everything other than 0 is RTL is going to have problems.

    But okay, that is an easy fix.

    There is a deeper problem, one that is a bit more insidious….

    Let’s start by looking at what locales fall into each category:

    Value Locales that have this value
    0 English, Russian, Thai, etc.
    1 Hebrew, Arabic, Persian, etc.
    2 Japanese, Chinese, Korean
    3 Mongolian

    Now in practice, vertical support in applications is such that the majority of text in Mongolian, Japanese, Chinese, and Korean should be treated as if it was in that first category along with English. Therefore, returning 2 or 3 is a kind of unrealistic idealism, which ultimately makes the code even more flawed than previously thought!
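
    Putting both points together, a check along the following lines is closer to what the colleague wanted. This is only a sketch: reading_layout_is_rtl is a made-up helper name, it takes the string that GetLocaleInfoEx fills in for LOCALE_IREADINGLAYOUT rather than calling the API itself, and it bakes in the assumption argued above that the vertical values 2 and 3 should be handled like 0.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: interprets the LOCALE_IREADINGLAYOUT string.
 * Only "1" means right to left. "0" is left to right, and the vertical
 * values "2" and "3" are deliberately treated the same way, since most
 * applications lay that text out horizontally.
 *
 * On Windows the buffer would be filled roughly like this (not compiled here):
 *   WCHAR layout[2];
 *   if (GetLocaleInfoEx(LOCALE_NAME_USER_DEFAULT, LOCALE_IREADINGLAYOUT,
 *                       layout, ARRAYSIZE(layout)) > 0 &&
 *       reading_layout_is_rtl(layout)) { ... treat the locale as Bidi ... }
 */
int reading_layout_is_rtl(const wchar_t *layout)
{
    assert(layout != NULL);
    return layout[0] == L'1' && layout[1] == L'\0';
}
```

    Comparing against the one value that means right to left, rather than against "anything but 0", is what keeps Japanese or Mongolian from being mistaken for Bidi.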

    There is more to the story beyond this one buglet, and I’ll be talking about this further, soon….

  • Sorting it all Out

    No, I am your father, Chris!

    The date was September 19, 2006.

    The time was 1:56am.

    The mail said:

    Obi Wan Kaplanoibi,

    There must be a way to force Windows to render Tahoma at 8pt whenever it wants to render it at 9pt - systemwide?

    You're my only hope.

    Chris

    A mere 32 hours later, it turned into Shrink that font -- automatically?.

    Things were a bit more "rough and ready" in those early days!

    Now I doubt I ever even set eyes on the email since that time.

    Until last night.

    You see, last night I was looking for a different email from around that time.

    And I saw the original email address of Chris.

    It was chris@lockergnome.com.

    Call me crazy, but I'm reasonably certain that was (is?) Chris Pirillo.

    Now if you look back in the Wayback Machine at http://www.pirillo.com/ you see that the site first started in 1998. He had just graduated recently from the University of Northern Iowa.

    When he wrote to me originally? He was a student there. It may have even been related to a homework assignment!

    The relative speed of my response was obviated by the fact that I never replied directly to him and didn't answer the question, choosing to tell a story his mail reminded me of, instead!

    I had never met him in person until just a few years ago - a friend of Ellen's.

    But by then I certainly knew who he was!

    Now I can mostly blame the Star Warsian feel of what popped in my head when I saw this on his "Obi Wan" reference.

    But what I imagined him saying to me was (in a mechanically modified James Earl Jones voice):

    "I've been waiting for you, Obi-Wan. We meet again, at last. The circle is now complete. When I left you, I was but the learner; now *I* am the master."

    And of course the quote in the title.

    Now I'm not gonna call Chris evil, but he did move over mostly to Apple stuff years back. So it's a relative thing... :-)

  • Sorting it all Out

    Better know an Exec part 2 (Quality time with Rich Kaplan, his wife Karmann & 13 others)

    • 2 Comments

    Previous parts in this series:

    Today's Exec is not a stranger to me.

    In fact my connection to Customer & Partner Advocacy Corporate VP Rich Kaplan dates back to August of 2007 and an irate customer who was cold calling every Kaplan in the phone book trying to accost him!

    (I told the story previously in Are you Mr. Kaplan?)

    He and I would then bump into each other over the following years, like at summits and presentations and such.

    One of those times, we talked about our shared name, and I told him my "Ellis Island" story, told previously in My name is Dragutsky. Michael Dragutsky.

    Another time I was having lunch in the Commons with Ellen and he came up to me and introduced his parents (two other Kaplans!).

    He is the first Exec I talked with about my desire to see the corporate matching ceiling raised, and as one of those people who contributes heavily to Fred Hutch, he definitely knows where I am coming from.

    My friend Jenny Lay-Flurrie, who I met a few years back talking about education about accessibility, just recently took on a new role in the Customer and Partner Advocacy org, working for Rich.

    You might say that within one conversation/two emails with her and one with Rich, I began to really crystallize what would become points 3, 4, and 5 from What I'd do with my 'Microsoft 20% time'!

    As I bounced ideas off them, my more nebulous aspirations began to take shape, so perhaps you may want to think of some of this blog today as an origin story, of sorts!

    It started right after Hanging with Marlena Werder + her skip & skipskip level!.

    I was taking a taxi from the Woodmark to Canlis, because despite the fact that I was traveling from spending the day with three VPs and a Leadership Team to have dinner with another VP and his wife, Shuttle Services simply couldn't accommodate my needs.

    {Insert eye roll here! I get the feeling that if I was leaving Steve to meet Bill they still wouldn't be able to help me!}

    Anyway, with the help of a cab driver who was promised a hefty tip if he could get me to my destination before dinner started¹ ².

    Dinner was with Rich, his wife Karmann, and seven other couples. In a private room.

    I felt a little bad because I think I outbid M3 Sweatt for my spot, and M3 is a guy who just a few weeks later said stuff about me like this on Twitter:

    Hopefully he'll be able to forgive me eventually!

    I called Cathy a few weeks prior, and she was willing to ignore the fact that we'd be out late on a Thursday to be my +1 when the original +1 proved unavailable. She even made it there before I did, actually!

    And Jenny was there too, with her husband. I was surprised since she wasn't on the original list (they got to sub for another couple who had a conflict). I was grateful since she was going to be the next person asked if Cathy couldn't make it, and if that had happened, I wouldn't have met her husband!

    Altogether, the seven couples raised $3,820 -- which Rich and Karmann were going to match over and above the Giving Campaign matching, and donate it to the Fred Hutchinson Cancer Research Center, where they are on the Ambassadors’ Board.

    He also noticed my name on the guest list for the Fred Hutch Gala, and we talked about that for a bit (he got me a seat at a "Softies" table for the event, which was coming up in just a couple of days. :-)

    And of course the amazing dinner on that night at Canlis.

    I'll put up the menu to get you all jealous!

    Appetizers:
      1. Oysters with Red Wine Mignonette
      2. Crab Cakes with Lobster Coral
      3. Flashed Seared Teriyaki Tenderloin

    Canlis Salad: “100 Best Dishes in America” - Saveur Magazine

    Choice of Entrées:
      1. Wild King Salmon
      2. Forest Mushroom Risotto
      3. Naturally Raised, Prime Filet Mignon & Prawns

    Dessert: “Perfect Trio”

    “No Holds Bar”

    Yum!!!!!

    I also conveyed greetings from Marlena and Barbara who I had recently taken my leave of.

    Several people were amused at the schedule I set myself up for that day, not that I had much choice....

    In the end, this was the single most impressive December 1st of the 41 I've experienced in my lifetime thus far.

    I may retire December 1st's jersey at this point!

    The evening and this particular Exec and his wife provided something that was, more than anything else, about the spirit of camaraderie and generosity.

    And a very tangible (and delicious!) reward for it!

     

    1 - I made it before dinner.
    2 - He got that tip.

  • Sorting it all Out

    It's not that they're putting the Pressure on Windows, but maybe the Pressure.Net? :-)

    • 12 Comments

    The other day when I blogged about how If someone blathers on about how Windows supports Unicode, you can suggest they just ZIP it, if you like!, I didn't tell the whole story.

    I focused on Windows, and a peculiar engineered schizophrenia that is Windows compressed folders, the most non-Unicode piece of Windows outside of stuff that was yanked out of the product years ago.

    But if you look to Windows 8, there is a component shipping with it that is aspiring to do more.

    I'm referring to the .Net Framework, whose version 4.5 has added the new ZipArchive Class to the venerable (for .Net, at least!) System.IO.Compression Namespace.

    Done right, too -- using UTF-8 so they can support whatever kind of characters you want -- anything in Unicode!

    The code was in no way connected to the code Windows licensed for the Compressed Folders feature.

    And you know how Windows isn't allowed to provide a programmatic way to get into those compressed folders?

    Well everything you do in .Net is something programmatic!

    On top of all of that, you can install the Microsoft® .NET Framework® 4.5 Developer Preview - Full on Server 2008, Windows 7, or Server 2008 R2 as well as just using it on the Developer Preview of Windows 8.

    Amazing!

    There must be a catch though, right?

    Don't worry, there is.

    The catch is most easily described by comparing/contrasting the CP_ACP code page-based support of ZIP in Windows and the UTF-8-based support of ZIP in the Developer Preview of .Net 4.5.

    These two technologies, much like the two encodings supporting them, have one thing in common.

    And when I say one thing, I mean 128 things.

    That's right -- what they have in common, all they have in common, is ASCII.
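
    To see how little the two really share, here is a small sketch. It is in Python only because that is quick to run; nothing in it is the .Net API. It assumes cp932 as a stand-in for CP_ACP on a Japanese system, and `same_bytes` is a made-up helper name.

```python
# Compare the bytes a file name turns into under a legacy code page and UTF-8.
# Assumption: cp932 stands in for CP_ACP on a Japanese system.

def same_bytes(name: str, legacy: str = "cp932") -> bool:
    """Return True when the legacy code page and UTF-8 give identical bytes."""
    try:
        return name.encode(legacy) == name.encode("utf-8")
    except UnicodeEncodeError:
        # The name cannot be represented in the legacy code page at all.
        return False

for name in ("readme.txt", "あ.txt"):
    print(name, name.encode("utf-8").hex(" "), same_bytes(name))
```

    The ASCII-only name comes out byte-for-byte identical either way. The name containing あ does not, so a reader has to know which encoding the writer used.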

    So although .NET is doing the right thing here, it is essentially only doing the right thing if either:

    • everything is English only, or
    • people use .Net to build the original .ZIP files, and to extract them, themselves.

    Otherwise, I suppose bugs like System.IO.ZipArchive zipped only UTF-8 encoding are just going to be par for the course in markets like Japan where, even though all that JIS X 0213 and IVS and Emoji are important, lots of people are still happy within their own code page, at least.
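
    What that kind of bug looks like from the receiving end can be simulated in a few lines. This is a hedged Python sketch and not what Windows Explorer actually runs: it stores a name as UTF-8 and then reads the bytes back as if they were in the code page. Again cp932 is an assumed stand-in for CP_ACP, and `read_utf8_name_as_legacy` is a hypothetical helper.

```python
# Simulate a code-page-based reader looking at a name that was stored as UTF-8.
# Assumption: cp932 stands in for CP_ACP on a Japanese system.

def read_utf8_name_as_legacy(name: str, legacy: str = "cp932") -> str:
    """Encode a name as UTF-8, then decode those bytes as the legacy code page."""
    stored = name.encode("utf-8")
    # errors="replace" keeps going past invalid byte sequences instead of raising.
    return stored.decode(legacy, errors="replace")

for original in ("readme.txt", "あ.txt"):
    seen = read_utf8_name_as_legacy(original)
    status = "intact" if seen == original else "corrupted"
    print(repr(original), "->", repr(seen), status)
```

    The English name survives the trip and the Japanese one does not, which is the "English only" condition from the list above in miniature.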

    This suggests one of the big problems of having no user interface:

    You have no user interface!

    Damn.

    So we still have this flaw in Windows, though this design means that for many customers it is too easy to be dragged down in .Net, too.

    Now of course there is not too much of a sign that .Net wouldn't try to help here, to at least bridge the scenario a little.

    Or maybe they could take advantage of some of that new managed side by side stuff and provide a couple of context menu items for zipping and unzipping files? Though that approach would also be fraught with peril and confusion for many people in the long run, unless they literally took over the ZIP handling for always, even though they localize into just ~10% of the languages of Windows.

    They would be saviors, but not everyone would fully appreciate their largess.

    Perhaps including Windows....

    Or maybe Windows could do this same thing themselves and use the managed code ZipArchive Class. I mean, they know more about shell handlers than anyone in the universe, and they have a much larger Extent Of Localization. Hooking up .NET here might be cheaper than fixing the ZIP problem they have never fixed.

    They could take the Notepad approach to Unicode support:

    • Do it using the CP_ACP
    • If you at any point ever detect anything off the codepage, show some magic UI (fix the text to more reasonably capture what it would do, of course!):


    • Then use .Net in this case, as needed.

    Not perfect, but certainly it has been good enough for text files in Notepad since the Windows 2000 beta!
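
    The decision at the heart of that approach is small enough to sketch. This is Python for illustration only, assuming cp932 plays the part of CP_ACP and skipping the "magic UI" step entirely. `pick_name_encoding` is a hypothetical helper, not something Windows or .Net provides.

```python
# Notepad-style fallback: stay on the code page unless a name does not fit in it.
# Assumption: cp932 stands in for CP_ACP; the warning UI step is left out.

def pick_name_encoding(names, legacy="cp932"):
    """Return the legacy code page if every name fits in it, else 'utf-8'."""
    for name in names:
        try:
            name.encode(legacy)
        except UnicodeEncodeError:
            # At least one character is off the code page, so fall back.
            return "utf-8"
    return legacy

print(pick_name_encoding(["readme.txt", "あ.txt"]))       # all on the code page
print(pick_name_encoding(["readme.txt", "हिन्दी.txt"]))   # Devanagari is not in cp932
```

    An archive of Japanese names stays compatible with code-page-based readers, and UTF-8 only kicks in when there is no other way to keep the names intact.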

    Or they could just fix the bug they have now....

    Okay enough spitballing, you get the idea.

    The current work highlights both the best and the worst of two different business units within Microsoft.

    Looking at that bug report I mentioned above, the needs are expressed pretty clearly:

    Greg,
    Here is what Mr./Ms. Kamegawa wrote back (with some edit by me). Hope it helps.
    I think you've already answered most of the issue, but could you please recap it and/or add any comment?
    (Personally I would agree with him/her. Whatever the ZIP file specification is, it seems like the de facto standard in Japan has been MBCS ZIP files. For a very long time. It is great that the .NET API uses UTF-8 for globalization, but I'm afraid the lack of a viable option to create MBCS ZIP files would make the .NET API practically useless in Japan.)
    Thanks.

    ---
    [Summary]
    System.IO.Compress.ZipArchive stores MBCS file names in UTF-8.
    Windows Explorer can't handle UTF-8 ZIP files.
    So the ZIP files compressed by System.IO.Compress.ZipArchive are not extracted correctly by Windows Explorer.
    Windows Explorer should support extracting UTF-8 ZIP files. Otherwise System.IO.Compress.ZipArchive should support storing MBCS file names. And the latter seems more practical.

    [Background]
    Most Japanese users extract Windows Explorer to extract ZIP files. Japanese file names are used so frequently in Japan that the incompatibility between Windows Explorer and the ZIP files compressed by System.IO.Compress.ZipArchive is unacceptable.

    [Repro Steps]
    1. Create Japanese named files in c:\temp\.
    2. Compress them into ziptest.zip using the code below.
    --- Begin ---
    using (var zip = new ZipArchive(@"c:\temp\あ\ziptest.zip", ZipArchiveMode.Create)){
        var files = new DirectoryInfo(@"c:\temp").GetFiles("*.*");
        Array.ForEach(files, x => zip.CreateEntryFromFile(x.FullName, x.Name));
    }
    --- End ---
    3. Extract ziptest.zip using Windows Explorer.
    -> The Japanese file names are corrupted as Greg depicted before.

    [Expected Behavior]
    Twofold:
    1. Make Windows Explorer support extracting UTF-8 ZIP files.
    2. Make System.IO.Compress.ZipArchive support compressing ZIP files using MBCS or the system locale.

    The best would be 1. But I don't think it possible to make all the widespread Windows versions (i.e. XP/Vista/7/2003/2008/2008 R2) support UTF-8 ZIP files.

    So I would like to ask for 2. It will provide the maximum compatibility with Windows including the legacy-but-widely-used versions of Windows.

    I could hardly say more here than this.

    So no matter how you look at it, the new ZipArchive Class of the Microsoft® .NET Framework® 4.5 Developer Preview - Full is certainly putting the pressure on them to do the right thing, huh?

    Or at least the Pressure.Net!
