May, 2007

Sorting it all Out
Michael Kaplan's random stuff of dubious value
Be sure to read the disclaimer here first!
  • Sorting it all Out

    Something .NET does less intuitively than they ought

    • 2 Comments

    If you are someone who reads the BCL Team blog, you may have seen Josh Free's String.Compare() != String.Equals() that he just posted.

    Of course this is old hat if you are a regular reader here and remember seeing Invariant vs. Ordinal, the third or Something .NET does more intuitively than Windows, both posted last year.

    Even just today, I was asked by someone to provide some comments for this MSDN topic to clarify something in the comments for the String.Compare Method (String, String, Boolean) overload.

    The specific question related to which StringComparison enumeration member corresponded to that third ignoreCase parameter. Which is not clear to everyone. But then that whole overload wasn't necessary (more on this in the future).

    And I just realized that I am sick of all of the extra overloads off of System.String, the StringComparison enumeration, the StringComparer class, and all of the rest of the confusing methods that are there, all of which should have and could have been replaced with a simple usage of CompareInfo for linguistic comparisons.

    The whole reason methods kept getting added is that although they found the one method with an enumeration confusing, they found the one method with no options to be too limiting. So they started adding overloads and methods and named them such that no one could ever know which one to use without reading a fifteen-page document that no one understands, not even the really smart developers.

    "But Michael," they tell me, "the System.Globalization namespace is not referenced by default." This is an argument I refuse to buy, since every time there is an interface that is an important feature, it does get added by default, like Generics did in VS 2005. So System.Globalization is clearly not important enough to include, but it is important enough to wrap in dozens of different ways that no one understands.

    Grrrr.

    Ok, I am over it now....

    That is part of letting go when you work in new areas like I do now -- you have to try to not jump into the old area all the time if you think people might be doing something that you don't care for....

    Though on the plus side, I do get to help some teams work through these issues (I'll blog about this soon), and I suppose I still get to complain here.... :-)

     

    This post brought to you by Щ (U+0429, a.k.a. CYRILLIC CAPITAL LETTER SHCHA)

  • Sorting it all Out

    Cutting the cord while someone else is shoring it up

    • 12 Comments

    Vista ships with a bunch of IMEs, as previous versions of Windows did.

    And even though they use the Text Services Framework (TSF) rather than the venerable Input Method Manager (IMM), there was no big set of "fake KLID" values added to the registry to make them easy to load via a call to LoadKeyboardLayout.

    Which is technically a breaking change from Windows XP and Server 2003 (which had several IMEs using this "bootstrap" method of loading via LoadKeyboardLayout).

    (We can add this break to the one I talked about in What broke the input language messages?, I suppose....)

    I was thinking about this change a while back when William Rollison asked me:

    Do you, or do you know of someone who has some wrappers or examples of how to programmatically switch keyboard layouts and activate the Japanese IME in C#?

    (The question itself is one I'll provide some answers for another time -- since for the IME it involves a TSF sample!)

    An interesting factoid popped up when I installed Office 2007 on my Vista machine though -- it actually added three TSF IMEs and gave them those fake KLID values:

    • Microsoft Pinyin IME 2007 for Simplified Chinese (KLID E0200804)
    • Microsoft Office IME 2007 for Korean (KLID E0200412)
    • Microsoft Office IME 2007 for Japanese (KLID E0200411)
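
    To make the shape of those values a little more concrete, here is a minimal sketch (not anything from the Windows headers) that splits an eight-digit KLID string into its two 16-bit halves. The low word is the language identifier (0x0804, 0x0412 and 0x0411 for the three entries above), and the high word is what distinguishes the entry. The names KlidParts and parseKlid are hypothetical, made up for this illustration.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper type: the two 16-bit halves of a KLID.
struct KlidParts {
    std::uint16_t high;    // e.g. 0xE020 for the Office 2007 IMEs listed above
    std::uint16_t langId;  // low word, e.g. 0x0411 for Japanese
};

// Parses an eight-digit hexadecimal KLID string such as "E0200411".
// Returns false (leaving 'out' untouched) if the input is not exactly
// eight hex digits.
bool parseKlid(const std::string& klid, KlidParts& out) {
    if (klid.size() != 8) return false;
    std::uint32_t value = 0;
    for (char c : klid) {
        std::uint32_t digit = 0;
        if (c >= '0' && c <= '9') digit = static_cast<std::uint32_t>(c - '0');
        else if (c >= 'a' && c <= 'f') digit = static_cast<std::uint32_t>(c - 'a' + 10);
        else if (c >= 'A' && c <= 'F') digit = static_cast<std::uint32_t>(c - 'A' + 10);
        else return false;
        value = (value << 4) | digit;
    }
    out.high = static_cast<std::uint16_t>(value >> 16);
    out.langId = static_cast<std::uint16_t>(value & 0xFFFFu);
    return true;
}
```

    Feeding it "E0200411" gives a high word of 0xE020 and a language identifier of 0x0411; anything that is not eight hex digits is rejected.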

    Interestingly enough, I have been told that all three of these IMEs were developed by the same folks who developed the Vista IMEs and that in all three cases they are minor upgrades. The KLID values are there because Office 2007 also installs on previous versions of Windows, and there was a consistency goal to try and meet.

    So it does make one wonder why these pseudo-KLID values had to be removed if there was no actual architectural reason to do so. Seems like a bug to me, especially when the same team proved that they were able to make it work when they tried....

    It is a little odd for one doctor to cut the cord if another one is going to strengthen the cord for another baby, in any case. Especially if it is doctors in the same hospital basing the decision on different insurance plans.

    Or maybe I took the analogy too far. :-)

     

    This post brought to you by ポ (U+30dd, a.k.a. KATAKANA LETTER PO)

  • Sorting it all Out

    Nothing stinks worse than the thread locale, other than the thread code page

    • 14 Comments

    The piece of mail I got (via the Contact link) from Ken was:

    Hi Michael,
    I have run into what I believe is a bug in MultiByteToWideChar() and WideCharToMultiByte() when the code page parameter is set to CP_THREAD_ACP and 'default language for non-Unicode applications' has been set to Hebrew. This is seen when using the utility macros in atlconv.h like T2WC.

    I've created a simple test app that shows unexpected results on some systems.  The code page inferred from CP_THREAD_ACP is not the same as GetACP().  I have reproduced this on two different systems set to use Hebrew, but not on two other systems set to use Traditional Chinese - one of which was set to the full Traditional Chinese localized UI.  The source is part of a default .net 2003 generated console project.

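    #include "stdafx.h"   // assumed to pull in the Windows and ATL headers
    #include <iostream>   // std::cout is declared here, not in <ostream>
    int _tmain(int argc, _TCHAR* argv[])
    {
        // The system ANSI code page (what CP_ACP resolves to).
        std::cout << "default code page is " << GetACP() << std::endl;
        // The code page value the atlconv.h macros pass to the conversion APIs.
        std::cout << "_AtlGetConversionACP code page is " << ATL::_AtlGetConversionACP() << std::endl;

        // Ask the system which real code page that value resolves to.
        CPINFOEX cpinfo = {};
        GetCPInfoEx(ATL::_AtlGetConversionACP(), 0, &cpinfo);

        std::cout << "Thread code page is " << cpinfo.CodePage << std::endl;
        return 0;
    }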

    My results are:
    default code page is 1255
    _AtlGetConversionACP code page is 3
    Thread code page is 1252

    When my application calls T2WC, the results are incorrect and the codepoints are extended to 16 bits, but not converted to their Hebrew codepoints.  We are getting around this by using _CONVERSION_DONT_USE_THREAD_LOCALE, but I had wondered if others have heard of this problem before.

    Thanks for your time,
    Ken 

    Regular readers may recall when I pointed out Why I think the thread locale really stinks.

    (In fact, I was asked not too long ago to help clean up some of the bad usages of the thread locale in various parts of Windows in shell32.dll and shlwapi.dll, something I will probably be working on shortly!)

    Anyway, after Ken pointed out that the use of _CONVERSION_DONT_USE_THREAD_LOCALE works around the problem, it seems pretty obvious that CP_THREAD_ACP is none other than the LOCALE_IDEFAULTANSICODEPAGE as returned by GetLocaleInfo with the return of GetThreadLocale as the LCID.
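
    To see why picking the wrong code page produces exactly the symptom Ken describes (bytes "extended to 16 bits" rather than turned into Hebrew), here is a small portable sketch. It deliberately does not call MultiByteToWideChar; it just hard-codes how the bytes 0xE0 through 0xFA are defined in Windows-1252 and Windows-1255. The names CodePage and decodeByte are hypothetical, and the sketch covers only that one byte range.

```cpp
#include <cstdint>

// Hypothetical, illustration-only code page selector.
enum class CodePage { Windows1252, Windows1255 };

// Decodes one byte in 0xE0..0xFA to a UTF-16 code unit.
// Returns 0 for bytes outside that range; this sketch covers nothing else.
char16_t decodeByte(std::uint8_t b, CodePage cp) {
    if (b < 0xE0 || b > 0xFA) return 0;
    if (cp == CodePage::Windows1255) {
        // Windows-1255 places the Hebrew letters alef..tav in this range.
        return static_cast<char16_t>(0x05D0 + (b - 0xE0));
    }
    // Windows-1252 matches Latin-1 here, so the byte is simply zero-extended.
    return static_cast<char16_t>(b);
}
```

    The byte 0xE0 comes out as U+05D0 (HEBREW LETTER ALEF) under 1255 but as U+00E0 under 1252, which is what a conversion does when the thread code page says 1252 on a system whose ANSI code page is 1255.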

    Now the thread code page is a pretty shaky thing, and not only for the reasons that make me feel like the thread locale stinks. Imagine basing code page conversions on something that any code running in the thread can change any time. Yuck!

    In fact, it is downright nasty that ATL and MFC made a breaking change in version 7.0 in this area (as described here):

    String Conversions

    In versions of ATL up to and including ATL 3.0 in Visual C++ 6.0, string conversions using the macros in atlconv.h were always performed using the ANSI code page of the system (CP_ACP). Starting with ATL 7.0 in Visual C++ .NET, string conversions are performed using the default ANSI code page of the current thread, unless _CONVERSION_DONT_USE_THREAD_LOCALE is defined, in which case the ANSI code page of the system is used as before.

    Note that the string conversion classes, such as CW2AEX, allow you to pass a code page to use for the conversion to their constructors. If a code page is not specified, the classes use the same code page as the macros.

    For more information, see ATL and MFC String Conversion Macros.

    Yuck. I hate breaking changes that are bad. And this is definitely one of them. :-(

    Sorry Ken, the strange differences here are kind of by [bad] design. And your workaround is actually the fix here -- it works around what I consider a breaking change that breaks a little bit of ATL here.

    In the end, my best advice is to NEVER use either the thread locale or the thread code page. For anything. Ever....

     

    This post brought to you by װ (U+05f0, a.k.a. HEBREW LIGATURE YIDDISH DOUBLE VAV)

  • Sorting it all Out

    Usage (customer intent) vs. Design (developer intent)

    • 2 Comments

    The Monday morning note I got via the Contact link was:

    Something that was pointed out to me: Windows standard (UK/US at least) keyboard layouts translate particular control+symbol sequences into particular characters, for example Ctrl+[ becomes ESC (27). 27-31 are mostly mapped to the corresponding control characters after 'Z' except that 31 appears to be duplicated onto Ctrl+_, Ctrl+- and Ctrl+/.

    MSKLC generated layouts do not do this, and only have ASCII 0 mapped to Ctrl+2, ASCII 27-31 mapped to Ctrl+3 to Ctrl+7 and ASCII 127 mapped to Ctrl+8. Standard layouts have these too, but not as many people are aware of them.

    I tried playing around with several things including upgrading from MSKLC 1.3 to 1.4, but in the end only adding explicit mappings for these control sequences worked. The built layout does the right thing, but MSKLC whinges during build that "This is redundant and should be removed." Which it isn't, and shouldn't, unless MSKLC were to perform that mapping automatically.

    (As to why it's important: telnet/ssh users tend to complain if you don't get it right!)
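
    For reference, the convention behind sequences like Ctrl+[ producing ESC is the classic ASCII one: for the characters '@' through '_', the control code is the character with everything but its low five bits cleared. The sketch below shows only that arithmetic; it says nothing about how any particular keyboard layout is actually built, and the function name controlCodeFor is hypothetical.

```cpp
// Maps an ASCII character in '@'..'_' (0x40..0x5F) to its control code by
// keeping only the low five bits. Returns -1 for anything outside that range.
int controlCodeFor(char c) {
    if (c < '@' || c > '_') return -1;
    return c & 0x1F;
}
```

    With this, '[' gives 27 (ESC), and '\', ']', '^' and '_' give 28 through 31, the same 27-31 block the note above is talking about.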

    A small number of these (three of them) were fixed in MSKLC 1.4, specifically to fix the Windows telnet case. But beyond that, a keyboard author is on their own, and every day one finds new problems with character assignments in the CTRL and CTRL+SHIFT shift states, even though I have been speaking out against them since before MSKLC was even in alpha, and MSKLC has been complaining for almost as long....

    But if you truly want to assign something there, MSKLC won't stop you. And it will even keep a few explicit assignments when you load from an existing layout and not even warn you about them, at least not in 1.4.

    But beyond that, MSKLC definitely was aiming in a slightly different place in its design, somewhere different than either telnet or ssh. So if one gets a few extra validation warnings and decides to classify them as whinging or whining, then I guess we'll all just have to live with that, now won't we? :-)

     

    This post brought to you by [ (U+005b, a.k.a. LEFT SQUARE BRACKET)

  • Sorting it all Out

    On installing fonts

    • 0 Comments

    Mihai Suba asked over in the Suggestion Box:

    Is there a way to make Windows aware of a font copied in WINDOWS/FONTS by a program, without a restart or click-open that folder?

    For reference, I'll pull out the following two blog posts in a series entitled About the Fonts folder in Windows:

    This technique of opening the folder and copying the file there programmatically is now history as of Vista (as pointed out in part 3 above), and there was no documented mechanism to ask the fonts folder to do this work other than letting it detect the change and process it, in earlier versions.

    Parts 1 and 2 talk about the supportable way to make this work across all versions of Windows.

    And if you want the truly supported version that does all the work for you, fontinst.exe can be retrieved from the Microsoft FTP site here (just right click on the link and save it somewhere).

    Unfortunately all of the samples that show a created fontinst.inf seem to have gone missing from the Microsoft site, so you'll have to search around if you want to find one of the samples....

     

    This post brought to you by ឃ (U+1783, a.k.a. KHMER LETTER KHO)

  • Sorting it all Out

    The PUA outside of Unicode

    • 1 Comments

    Colleague Aldo Donetti asked me:

    Hi Michael, I was investigating a bug and it turns out that this character ‘’(U+E843) is in the Private use range but it is also part of the Chinese 936 codepage.

    The issue is whether to consider characters in the private use area as valid characters in Identifiers (e.g. in VB/C#/WebService names/…) – I would not allow them but I’m not too familiar with that range so I’m double checking with you. At present it is not allowed (as weird as it may seem).

    Thanks!
    Aldo

    Now the Private Use Area is a part of Unicode that I have discussed before (ref: previous posts). In particular, I have talked about the relationship between the PUA and EUDC (End User Defined Characters) like in this post.

    But an important thing to keep in mind is that the PUA is not just a Unicode thing.

    In fact, all of the East Asian code pages have areas set aside for private use, and specifically intended for the kind of characters that EUDC is intended for. The various ranges used (shown in the registry at HKLM\SYSTEM\CurrentControlSet\Control\Nls\CodePage\EUDCCodeRange) are:

    • 932 --- 0xF040-0xF9FC
    • 936 --- 0xAAA1-0xAFFE, 0xF8A1-0xFEFE, 0xA140-0xA7A0
    • 949 --- 0xC9A1-0xC9FE, 0xFEA1-0xFEFE
    • 950 --- 0xFA40-0xFEFE, 0x8E40-0xA0FE, 0x8140-0x8DFE, 0xC6A1-0xC8FE
    • Unicode --- U+E000-U+F8FF

    Looking at U+E843 (which is definitely in the Unicode PUA, covered in the defined range above) and its code page 936 mapping to 0xFE7E, it just kind of makes sense that the various ranges map to each other -- where else could they really map to if not to each other?

    But the behavior that does not allow them in identifiers sounds like a very good one, that should not change. Because whether one is in the Unicode PUA or the PUA of a code page, one is not looking at good candidates for identifiers....
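
    A check along those lines is easy to sketch. The one below tests a code point against the Unicode private use ranges: the BMP range listed above, plus the private use ranges in planes 15 and 16. It is only an illustration of the range test, not what any of the compilers actually do, and the names isPrivateUse and containsPrivateUse are hypothetical.

```cpp
#include <string>

// True if cp lies in one of Unicode's private use ranges.
bool isPrivateUse(char32_t cp) {
    return (cp >= 0xE000 && cp <= 0xF8FF)        // BMP Private Use Area
        || (cp >= 0xF0000 && cp <= 0xFFFFD)      // Plane 15
        || (cp >= 0x100000 && cp <= 0x10FFFD);   // Plane 16
}

// True if any code point in a candidate identifier is private use, in which
// case a validator following the policy above would reject the identifier.
bool containsPrivateUse(const std::u32string& identifier) {
    for (char32_t cp : identifier) {
        if (isPrivateUse(cp)) return true;
    }
    return false;
}
```

    U+E843 falls in the first range, so an identifier containing it would be flagged no matter which code page it originally came from.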

     

    This post brought to you by (U+e843, a code value in the Unicode Private Use Area)

  • Sorting it all Out

    I'd stop as soon as possible, IIf you'd let me

    • 2 Comments

    I read Paul Vick's post IIF becomes If, and a true ternary operator and saw that a version of Visual Basic was finally going to have an IIF() that was going to properly short-circuit and not evaluate both branches.
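
    The difference is easy to show in any language that has both function calls and a conditional operator. Here is a minimal C++ sketch (not VB, and not how any VB runtime implements it): a function-style IIf has both of its arguments evaluated before the call, while the built-in ?: operator evaluates only the branch it selects. The names evaluations, tracked, eagerPick and lazyPick are all made up for the illustration.

```cpp
// Counts how many branch expressions were actually evaluated.
int evaluations = 0;

// Stands in for a branch expression with a visible side effect.
int tracked(int value) {
    ++evaluations;
    return value;
}

// Function-style IIf: both arguments are evaluated before the call happens.
int IIf(bool condition, int truePart, int falsePart) {
    return condition ? truePart : falsePart;
}

// Evaluates both tracked() calls, whichever value is returned.
int eagerPick(bool condition) {
    return IIf(condition, tracked(1), tracked(2));
}

// Evaluates only the tracked() call on the selected branch.
int lazyPick(bool condition) {
    return condition ? tracked(1) : tracked(2);
}
```

    Calling eagerPick(true) bumps the counter twice, while lazyPick(true) bumps it once. That matters as soon as one of the branches divides by zero or dereferences something that is not there.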

    And I thought about how they cheated on this in Access.

    Well, technically not in Access, but in Jet. Or more specifically in the Jet Expression Service.

    That is the component that does all the work to run code in queries, and also to handle all of the various possible expressions one could put into controls on forms and reports.

    You ended up with that interesting (but subtle) difference in behavior where IIF() in a VBA function (that could be called from a query) could do something different than the identical IIF() code run from the query itself.

    Since the first time I had ever known people to request IIF() to be a ternary operator was in Access 2.0/EB, I guess that this was accomplished in fewer than fifteen years....

    And yes, it IS immediate if and not inline if. Wikipedia was wrong. Although I know lots of Access app developers who had the same misperception since they used it so commonly inside of properties to avoid writing separate VBA functions; this made it look like an "inline" operation so they called it that.

    Never mind all the names people had for the debug window. :-)

     

    This post brought to you by ∗ (U+2217, a.k.a. ASTERISK OPERATOR)

  • Sorting it all Out

    More significant than petty cache

    • 0 Comments

    Georg asked:

    One of my customers is enumerating fonts using the following code construct:

    foreach (FontFamily font in FontFamily.Families) {
        if (font.IsStyleAvailable(FontStyle.Regular))
            cbxFontName.Items.Add(font);
    }

    They are wondering regarding the performance of this enumeration. First run of the test app takes about 2 seconds, all consecutive runs take about 0.2 seconds. I assume this has to do with some memory caching.

    Thanks,
    Georg

    Well, in a way it is caching, but not so much memory caching. In this case it is the one-time hit (per session) of creating every single font.

    You can see a somewhat analogous hit if you use CultureInfo.GetCultures, and that bit of caching will only last for as long as the process is around since CultureInfo objects are not put in any kind of session level caching like fonts are....

    Although in this case memory caching may make it a little bit faster from time to time. :-)

    I should probably talk about the Windows side of all this, but I will let that wait for an upcoming post....

     

    This post brought to you by း (U+1038, a.k.a. MYANMAR SIGN VISARGA)

  • Sorting it all Out

    'Underage Thinking' is playing in the background

    • 0 Comments

    Absolutely nothing technical here! 

    When I tell people I enjoy listening to Teddy Geiger's Underage Thinking, people tend to look at me a little funny.

    And not just the people who have no idea who the hell Teddy Geiger is. Even the ones who know are a little confused.

    After all, he is hardly an angry female singer/songwriter (well, two out of four ain't too bad but it still seems against type).

    And I am also not a 17-year-old girl. I don't even know any 17 year olds at the moment.

    (I point this out since Teddy was on the cover of Seventeen last November!)

    But I carry on and point out his role of Wayne Jensen on Love Monkey, the show that should still be on the air.

    And if you take lyrics like this:

    I'm gonna muster every ounce of confidence I have
    For you I will
    You always want what you can't have
    But I've got to try
    I'm gonna muster every ounce of confidence I have
    For you I will
    For you I will

    It is obvious that he is very young and idealistic about love. Was I ever that young?

    Shit, I just realized why I am listening to him.

    Because I once was that young. And I once did feel that way, though I was not nearly as articulate about what I was feeling at the time.

    He has something that I lost some time over the last twenty years. Before I even knew it was something worth happening....

    Summary? As far as I can tell, Teddy Geiger is a more attractive, more articulate, more talented, and more successful version of the person I was back then. :-)

    (And also fall short of a Tom Ferrell type too, but that's another story!)

  • Sorting it all Out

    Honey, you are the [_tWin]Main source of joy in my life!

    • 1 Comments

    Human gjcoram asks:

    I'm developing a unicode e-mail client (nPOPuk) -- works in Win32, Win32Unicode, and various WinCE platforms. I was tracking down a memory leak that originated from the fact that "lpCmdLine" is a char* rather than a TCHAR* in the declaration of WinMain in winbase.h:

    WinMain(
        IN HINSTANCE hInstance,
        IN HINSTANCE hPrevInstance,
        IN LPSTR lpCmdLine,
        IN int nShowCmd
        );

    The original developer just blithely cast the LPSTR to a TCHAR*, and what was a nice "\0" was NOT the same as TEXT("\0").

    Shouldn't it be
        IN LPTSTR lpCmdLine
    ?

    Well, it isn't as simple as that, though at least you noticed the obvious bug in the original developer's code!
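
    As an aside on why that cast goes wrong: the terminator of a narrow string is a single zero byte, while the terminator of a wide string is a whole zero wchar_t, so a narrow buffer viewed through a wide pointer is not reliably terminated. The portable sketch below shows the size difference and a conversion that builds a new wide string instead of casting. widenAscii is a hypothetical helper that is only correct for 7-bit ASCII input; anything else needs a real code page conversion.

```cpp
#include <cstddef>
#include <string>

// Size in bytes of an empty narrow literal: just its one-byte terminator.
constexpr std::size_t kNarrowTerminatorBytes = sizeof("");
// Size in bytes of an empty wide literal: one whole wchar_t.
constexpr std::size_t kWideTerminatorBytes = sizeof(L"");

// Widens a narrow string one character at a time instead of casting the
// pointer. Assumption: the input is 7-bit ASCII.
std::wstring widenAscii(const char* narrow) {
    std::wstring wide;
    for (const char* p = narrow; *p != '\0'; ++p) {
        wide.push_back(static_cast<wchar_t>(static_cast<unsigned char>(*p)));
    }
    return wide;
}
```

    The narrow terminator is always one byte, and the wide one is sizeof(wchar_t) bytes (two on Windows), which is why the "nice" narrow terminator was not enough once the pointer had been cast.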

    If you look at the WinMain MSDN topic, it explains a bit about what is going on here:

    ANSI applications can use the lpCmdLine parameter of the WinMain function to access the command-line string, excluding the program name. Note that lpCmdLine uses the LPSTR data type instead of the LPTSTR data type. This means that WinMain cannot be used by Unicode programs. The GetCommandLineW function can be used to obtain the command line as a Unicode string. Some programming frameworks might provide an alternative entry point that provides a Unicode command line. For example, the Microsoft Visual Studio C++ compiler uses the name wWinMain for the Unicode entry point.

    Another interesting tidbit can be found by looking at the Windows CE version of the WinMain topic, looking at the prototype in particular (note the LPWSTR type of lpCmdLine, which I highlighted in red!):

    int WINAPI WinMain(
      HINSTANCE hInstance,
      HINSTANCE hPrevInstance,
      LPWSTR lpCmdLine,
      int nShowCmd
    );

    There is even an old comment to one of Raymond Chen's blog posts that regular SIAO reader Mike Dimmick made that covers the CE issue and talks about what the original WinMain topic kind of hinted at -- that wWinMain is a Microsoft CRT illusion that may not work everywhere.

    For good measure, there is even that KB article that talks about how to fix when the illusion breaks (MSKB 125750)....

    But maybe with everything considered, GetCommandLineW is the best way to go here, though if you are using the CRT even the Routine Mappings page points out the existence of _tWinMain, which will do what gjcoram was originally thinking about....

     

    This post brought to you by ᖈ (U+1588, a.k.a. CANADIAN SYLLABICS TLHO)

  • Sorting it all Out

    How to do more sorting and grouping by 9am than most ListViews do all day

    • 2 Comments

    Remember when I talked about Grouping and Sorting in a ListView, how I couldn't find a way to do it?

    Well, Bevan pointed out:

    Greetings from New Zealand.

    I found your post on ListView sorting at this link:

    http://blogs.msdn.com/michkap/archive/2006/03/06/544257.aspx

    Since comments are disabled, I thought I'd drop you this note - I've found a workaround to the issue:

    Step 1: Prior to applying the sort, remove the grouping by setting Group to null on each ListViewItem. As you do this, cache the grouping.

    Step 2: Sort the list normally

    Step 3: Restore the groupings cached earlier

    TaDa!

    Any chance you could post an update to the blog post so others can find the workaround?

    Keep Smiling,

    Bevan.

    I do get a lot of benefit out of that 90-day limit on comments; in fact, looking at the place where caught spam is found, I'd get even better results with a shorter time. But I like to give people the option for a bit....

    Maybe I never thought I'd still be doing this after this long? :-)

    Anyway, cool idea Bevan. Though this method does not let one sort within the groups, it does allow one to put together a sort that will be applied to all of the groups. Not what I was hoping was lurking somewhere but good enough for most uses, I think. :-)
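
    For anyone who wants to see the shape of Bevan's three steps without a ListView in front of them, here is a toy model in portable C++. It is only a sketch of the sequence (cache and clear the groups, sort, restore) over a plain vector. The Item struct, its id field and sortKeepingGroups are hypothetical stand-ins, not WinForms types, and in a vector the clearing in step 1 has no effect on the sort; it is the real control that makes that step necessary.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a ListViewItem: an identity, the text to sort
// by, and a group name (empty meaning "no group").
struct Item {
    int id;
    std::string text;
    std::string group;
};

void sortKeepingGroups(std::vector<Item>& items) {
    // Step 1: cache each item's group, then remove the grouping.
    std::map<int, std::string> cachedGroups;
    for (Item& item : items) {
        cachedGroups[item.id] = item.group;
        item.group.clear();
    }

    // Step 2: sort the list normally (here, by text).
    std::sort(items.begin(), items.end(),
              [](const Item& a, const Item& b) { return a.text < b.text; });

    // Step 3: restore the groupings cached earlier.
    for (Item& item : items) {
        item.group = cachedGroups[item.id];
    }
}
```

    After the call, the items are in text order and each one is back in the group it started in.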

    I wonder if it can be relied on since it is not documented? Or maybe the fact that search through MSDN will find this post in a few days means it is documented in a weird kind of a way. :-)

    Ah well, a discussion for a future version when the behavior changes. For now, use with joyous caution....

    Thanks, Bevan!

     

    This post brought to you by Ṝ (U+1e5c, a.k.a. LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRON)

  • Sorting it all Out

    Could it BE any clearer that it isn't the Limonata?

    • 3 Comments

    (Nothing technical here, I promise!) 

    Sometimes, the truth of a situation is subtle. It takes a whole lot of evidence to figure it out.

    And other times, it is so obvious that you wonder why you didn't notice it in the first place....

    Start with the fact that where I used to exercise (running) I stopped several years ago, around the time that I started using a cane. Okay, I guess that was about ten years ago.

    Add to that the fact that I stopped walking for the most part much more recently.

    There is the fact that I am getting older. I can't make fun of my sister for being all old and such since I am nearly two years older than her. That much closer to forty than she is, too.

    Factor in that my weight is the same as it was when I was 24. Despite the fact that I was in much better shape then. The muscle/fat balance has clearly been shifting, even though it is not yet shifting the scale. The pants are the same but some of them have a tiny bit of a muffin top thing going. Which if you don't know what I mean I am not going to explain.

    Then there are the cases of Limonata I just ordered. Aside from the odd extravagant meal I have, it is really the only thing different about my diet. I probably eat closer to healthy now than any other time since I was 18.

    And finally, there is that Slimquick commercial. You know, the one where the cartoon figures of the woman and the man are there and she is saying

    My husband quit drinking soda and lost 12 pounds. I quit and lost one.

    My husband quit eating subs and lost 15 pounds, and I quit eating bread 2 years ago!

    My husband took up jogging....

    Now enough already, what do you want to weigh ZERO?!?

    You know that one?

    So, to sum up....

    I could blame the age.

    Or the lack of exercise.

    Or the recent even less exercise since I have the scooter.

    Or could stopping the Pro Club membership be an overt act that would affect anything even though I stopped going six months before that? Why not?

    Or I could blame the Limonata.

    Frankly, I've decided that although The Nile is a river in Egypt that it has a tributary flowing through Redmond, WA. And I will stay by the banks of that river until I gain 12 pounds, like the cartoon husband in the Slimquick commercial. Because he is smiling when she is all mad at him. And because I love the Limonata, and I'd rather just blame some combination of the age thing and the [lack of] exercise thing, since I could try and do something about the latter even if the former is outside of my control....

    There is an off chance the exercise might help fix the problem even if it really is due to Limonata.

    But it isn't.

    I'm really sure that it isn't.

    Stop looking at the stack of cases of Limonata, dammit!

    It is not the Limonata.

     

    This post brought to you by ☵ (U+2635, a.k.a. TRIGRAM FOR WATER)

  • Sorting it all Out

    Is it a cliché to say something like 'We're hiring!' ? :-)

    • 7 Comments

    You can see the image over on the right side of the blog -- just under my brain and just over the archives. It has a link to the current open posting across the entire organization (full size about 400 or so people). The positioning is an accident of ease of placement; none of the jobs report directly to me or my brain. :-)

    If you end up interviewing for one of these positions and they let you stop by and say hello then be sure and do so -- I am very proud of the people who have come to work at Microsoft who were in whole or part originally inspired by Sorting It All Out!

    Special thanks to Jennifer Shepherd (the design genius who created the updated MSKLC 1.4 art/icons) for the cool image that you can see over on the right side of the blog -- I could have put it inside a collapsible category, but it looks so good that I'll leave it out in the open for now.

     

    This post brought to you by ⼯ (U+2f2f, a.k.a. KANGXI RADICAL WORK)

  • Sorting it all Out

    Getting the language of an LCID-less keyboard

    • 3 Comments

    Tom, a reasonably senior developer, asks (product names removed for no particular reason):

    Can users of publicly available keyboard layout creator make a layout that is not tied to a specific LCID?  In other words, could the keyboard layout language be a custom culture with no defined LCID?  If so, how can applications detect the keyboard language?

    ... [And if so] how does a Win32 app know the language of the keyboard?  󰀀󰀁󰀂󰀃󰀄󰀅 generally relies on Win32 APIs to get the current keyboard's LCID, and to enumerate the LCIDs of the user's list of installed keyboards.  What would you suggest 󰀀󰀁󰀂󰀃󰀄󰀅 do to surface the benefit of these LCID-less keyboards?

    Of course one of the principal features of MSKLC 1.4 is support of custom locales on Vista, so clearly it is possible to create a keyboard layout that is not tied to a specific language LCID.

    Though the desire to know something about the language is a valid one, in any case. And there is a way to get it. :-)

    In the registry, in a subkey under that key I mentioned in Maybe it was registry rumination and How can be changed the keyboard layout name label?, a special registry value was added to go with any custom language that came from a custom locale, the value name of which is Layout Locale Name and it will contain the LOCALE_SNAME value of the locale that was selected by the keyboard author for the association.

    That registry value will contain the value you need (like en-US, but for the custom locale that the user chose). This will be true on any machine, whether the custom culture exists or not.

    (The KLID that will be used in this case is in the form ####0c00, the reasons for which will make another post at some future time so interesting!)
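
    If all you have is a KLID string and you want a quick test for that ####0c00 form, a check like the following would do. It is a minimal sketch that only looks at the shape of the string (eight hex digits, the last four being 0c00 in either case); it does not touch the registry, and the function name looksLikeCustomLocaleKlid is hypothetical.

```cpp
#include <cctype>
#include <string>

// True if klid is eight hex digits whose last four are "0c00" (any case),
// which is the ####0c00 form described above.
bool looksLikeCustomLocaleKlid(const std::string& klid) {
    if (klid.size() != 8) return false;
    for (char c : klid) {
        if (!std::isxdigit(static_cast<unsigned char>(c))) return false;
    }
    return klid[4] == '0'
        && std::tolower(static_cast<unsigned char>(klid[5])) == 'c'
        && klid[6] == '0'
        && klid[7] == '0';
}
```

    A layout that passes this test is one where the Layout Locale Name value is the thing to read, since the KLID itself no longer tells you the language.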

    This will let Tom (or anyone) who is retrieving locale information have the opportunity to find out what locale the layout has been associated with.

    In an upcoming post I'll talk about how to find a keyboard layout's registry subkey after it was created with MSKLC (which may be an important first step if you need a specific one!)....

     

    This post brought to you by 󰀀 (U+f0000, the first Plane 15 character)

  • Sorting it all Out

    MS Word says PUA is EA/CJK? TISNF!

    • 1 Comments

    Parag asks:

    Hi,

    I am a font developer from India.

    I have designed a multi-byte font using the private use area of the Unicode range, and also a keyboard handler through which I want to use this multi-byte font in various applications.

    My problem is, when I run my keyboard handler, set my font and try to type in MS Word, the font name automatically changes to MS Mincho and the characters are seen in Chinese.

    Surprisingly, if I select the text and apply my font to it. I could see my language (Bengali) properly.

    But why initially the font is getting changed to MS Mincho? Do I need to do something in intl.inf?

    If the first character I type is space, the font does not change and I could properly type Bengali.

    Please help...... I am frustrated.....

    Regards,

    Parag

    It is hard to say without knowing exactly what parts of the Unicode Private Use Area are being used, but once I remind you about the whole PUA? P.U. ! issue, I'll just add to the whole PUA/symbol issue by pointing out that apps like Word will assume that PUA in running text is East Asian EUDC.

    You might be able to get away with this not happening if you do language tagging so that Word does not think it is Chinese. Though don't try to make it Bengali or it will try to do all sorts of other things, or just not even be willing to let it get tagged that way.

    Or maybe just give up on the PUA for this kind of thing when you use Word -- it really does seem like a "Unicode or bust" scenario....

    There is some great information about developing Bengali OpenType fonts here, though. Highly recommended, rather than going down the PUA route. :-)

     

    This post brought to you by ন (U+09a8, a.k.a. BENGALI LETTER NA)
