
November, 2005

Sorting it all Out
Michael Kaplan's random stuff of dubious value
Be sure to read the disclaimer here first!
  • Sorting it all Out

    Not sure if I am blue, but the Novantrone is!

    • 2 Comments

    Just a quick M.S. update for the curious....

    I just had my second Novantrone infusion this morning, but I am starting to doubt that thing about the whites of the eyes getting bluish. I think they tell people that since it encourages folks like me to sign up for being poisoned every three months. :-)

    But the infusion went well once it started.

    It started a little late because the order never made it in front of my neurologist and the nurses in Special Procedures will not do the infusion without a signed order (cannot be faxed). On top of that, the pharmacy will not dispense the medication to Special Procedures without such an order, either.

    But my neuro came in herself and we chatted for a bit as she filled out the form, and the NP came down and gave me money for parking. So all was good.

    I'll let you know how I am feeling in a few days. In the meantime, I am a conditional fan of the whole Novantrone thing.... :-)

  • Sorting it all Out

    Expectations around collation

    • 8 Comments

    Back many years ago when I first moved to Columbus, OH (the place I lived before Redmond, WA) I owned a cat named Kim (short for Kimberly Cleopatra).

    Well, I didn't really own Kim; no one ever really owns a cat. I think she pretty much owned me. :-)

    Anyway, she injured herself and the veterinarian had to put one of those collars on her for a few days. His parting advice was to have Kim take it easy for a few days.

    It was at first kind of amusing to me, thinking that he must not own a cat if he thinks such things are under the control of the "owner" of the animal.

    But my thought later was wondering how an animal who sleeps 21 hours a day can actually "take it easy". Since that appears to be her job, her nature, what she just naturally does.

    Kim has passed on now (she was quite a few years old when I inherited her)....

    Anyway, assuming you are still reading I will explain where I am going with this.

    There is an interesting side conversation going on in the comments of posts like the one I wrote about the Hungarian technical sort or about whether a comes before A. The interesting part is people talking about the various expectations of different kinds of users, who may or may not be in different locations or speaking different languages or performing different tasks.

    I found it quite amusing as people from different viewpoints added their opinions about what was the expectation, which seemed to change depending on all of those different factors.

    And I was in a meeting yesterday where I found myself amused again as people in the meeting talked about the expectations of users who may prefer English sorting versus German sorting versus Dutch sorting.

    It was at first kind of amusing to me since all three of those languages use the default collation table, so it seemed funny to talk about conflicting expectations when all three would sort the same way on Windows. :-)

    But my thought later was wondering how a feature that the majority of users cannot describe the rules of effectively could engender so many discussions!

    It is just one of those things that people naturally expect Windows to do for them.

    And as I watched the uncontrollable force of the people in the meeting talking about the people who use Windows, as I thought about the readers of this blog who commented about their various (sometimes conflicting) expectations that they have around collation which are also quite uncontrollable, I can't help remembering Kimberly Cleopatra and hoping she has found her door into summer.

    After all, her expectations pretty much represented "the way things are supposed to be", too.

    Maybe that is why I like collation so much -- its cat-like-inspiring qualities? :-)

     

    This post brought to you by "Ѥ" (U+0464, a.k.a. CYRILLIC CAPITAL LETTER IOTIFIED E)

  • Sorting it all Out

    Oops, we did it again

    • 25 Comments

    I could not resist the Britney Spears reference, sorry!

    But there is yet another Language Interface Pack, this one for Nepali!

    A tiny little bit of info about the Nepali language:

    Nepali (sometimes also referred to as "Nepalese") is the official language of Nepal, where it is spoken by roughly half the population as a mother tongue and by about 2 million people as a second language. It is related to Hindi but has borrowed fewer words from Persian and English (instead using more Sanskrit derivations), and it has been influenced by the neighboring Tibeto-Burman languages.

    Very cool!

     

    This post brought to you by "न" (U+0928, a.k.a. DEVANAGARI LETTER NA)

  • Sorting it all Out

    SQL Server's cultural sensitivities

    • 0 Comments

    Souri asked me:

    Michael,

    Does SQLServer 2005 support culture aware Sorting?

    Thanks,
    Souri

    Hi Souri!

    Not only does SQL Server 2005 do this, but SQL Server 2000 does as well; even SQL Server 7.0 supports a limited version of it.

    But the word 'culture' in your question does seem to imply a connection to the .NET Framework, and just in case the question being asked of me was in reference to whether SQL Server 2005 supports the same collations as the .NET Framework, you will want to take a look at posts of mine like String.Compare is for sissies (not for people who want SQLCLR consistency) and Real developers use CompareInfo's Compare (Part 1) to see some of the things you need to do to make SQL Server and the CLR more compatible....

    Note that these techniques will get you close but will not give you perfectly consistent results -- there is no way to make the match 100%.

     

    This post brought to you by "Ǿ" (U+01fe, a.k.a. LATIN CAPITAL LETTER O WITH STROKE AND ACUTE)

  • Sorting it all Out

    Custom keyboard, custom language?

    • 9 Comments

    Madhava Tennakoon asked me:

    I am a developper of fonts for Sinhala/Sinhalese Language - Sri Lanka. Those days we developped fonts for Sri Lankan Standerd like ANSI (256 Chrs). After the Unicode standerd came now there is a page for Sinhala(Sinhalese) U+0D80. So I Developped new font using that page and glips.

    Then there is a problum. Microsoft deos not have keyboard for Sri Lankan Sinhala Language.

    So I downloaded the program named Microsoft Keyboard Layout Creator (MSKLC.EXE) and tried to create a Sinhala Keyboard Layout. In This case I can develop a keyboard but in that pakage asks for a "Language" in its properties dialog. But .. It (Microsoft) Deos note have a language as Sinhala or Sinhalese for Our Language.... and it has a LCID for Sinhalese but there is no way to use that.

    Please help me to solve this problum.
    If there is a place to download our language from somewere or....
    If there is a way to create a new language.

    Please help me....

    Thanks.

    Madhava Tennakoon
    Sri Lanka

    This hits on the biggest limitation that exists for MSKLC: the fact that it is limited to the list of locales that shipped with the 1.1 version of the .NET Framework.

    There are at least four things that make this worse:

    • The expanded list of languages in ELK v.1 and ELK v.2
    • The Sinhalese language enabling work done after the Tsunami (which also gave it an LCID --  0x045b)
    • The even more expanded list of languages that will be happening in Vista
    • Custom culture support in the .NET Framework 2.0

    Now one thing I will mention is that it is one of the most requested items on the feature list for the next version of MSKLC.

    In the meantime, the list of languages is static, even though you can cover any language you want with the keyboard itself.

    Slightly embarrassing given the fact that I not only developed MSKLC but was involved to some extent with every one of those bulleted items above. Kind of like engineering to make myself look foolish? :-)

    Sorry about that, and it will get better next time around....

     

    This post brought to you by "ඈ" (U+0d88, a.k.a. SINHALA LETTER AEEYANNA)

  • Sorting it all Out

    See Lewis Black at the Paramount on December 3rd

    • 0 Comments

    For all those who thought I had no political opinions? :-)

    Lewis Black, one of the most awesome forces of nature to hit The Daily Show since Jon Stewart, is going to be at The Paramount on December 3rd, a show definitely worth seeing!

    More info at foolproof.org, right here.

    There is even an after hours in the Paramount lounge after the show (more info here).

    Definitely a worthwhile show; if you are going, let me know and we can hook up or whatever. :-)

  • Sorting it all Out

    Technically it *is* a Hungarian sort

    • 11 Comments

    Back in August in the post Double compressions -- Hungarian goulash? I described how double compressions worked in Windows and the .NET Framework.

    And then a week ago in Hungarian is even more complicated than I thought I talked about an additional interesting wrinkle in this particular language's collation.

    There were some interesting comments in that post, like this one:

    I tell you a story. I had a strange error on MS SQL Server. select ... where [Product Identifier] = '%SG%' did no find the product with the identifier of "KCSG01"

    A friend suggested that maybe it treats "cs" as one letter. I said impossible, even MS can't be so crazy. And he was right - after setting collation to binary it worked!

    I it is completely amazing - who wanted this this feature? Who needs it? Why did it have to be developed and hardcoded into Windows/MS SQL? I agree that a grammatical analyser function library might sometimes useful to someone, but to hardcode it right into the OS!... Why?

    When users search for "ddzs", they don't want to find "dzsdzs" - they are searching for LETTERS, you know, they don't want to keep all these grammatical rules in their heads. No one expects that their search input will be grammatically analysed!

    So why has this feature been implemented?

    Which made me think about the fact that the Hungarian Technical Sort exists as an alternate sort for Hungarian (its LCID is 0x1040e). This sort has several characteristics that distinguish it from the standard Hungarian sort (0x040e):

    • None of the compressions that I have talked about previously
    • None of those Hungarian double compressions, either
    • The uppercase letters come before the lowercase ones, unlike most other language collations on Microsoft products

    There is, in fact, nothing uniquely Hungarian about it and anyone who was wanting the uppercase/lowercase thing reversed might be happy with the ordering.

    The perfect answer for those more technical situations when one does not want to be bogged down by those linguistic collation details, right? :-)

     

    This post brought to you by "ʥ" (U+02a5, a.k.a. LATIN SMALL LETTER DZ DIGRAPH WITH CURL)

  • Sorting it all Out

    Getting all of the alternate sorts for a locale

    • 2 Comments

    Recently, Eman asked in the newsgroups:

    Please help me clear up general LCID logic.

    As far as i understand the SORT_xx values make sense in a pair with PrimaryLanguage only. In other words, a Sort ID has the same meaning for a set of LCIDs where PrimaryLanguage is the same but SubLanguages differ. Is this correct?

    I had to be the bearer of bad news, unfortunately:

    No, that is not the case.

    Sort IDs make sense with specific LANGID values for which there are supported sorts. It has nothing to do with primary language ID alone.

    And this is an important distinction -- for example, 0x0407 is German - Germany, and 0x0807 is German - Switzerland. You can get them by using:

    MAKELANGID(LANG_GERMAN, SUBLANG_GERMAN)

    MAKELANGID(LANG_GERMAN, SUBLANG_GERMAN_SWISS)

    but if you wanted to use German phone book alternate sort, then only the following will work:

    MAKELCID(MAKELANGID(LANG_GERMAN, SUBLANG_GERMAN), SORT_GERMAN_PHONE_BOOK)

    because it is not defined for any of the other German language locales.
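
    The arithmetic behind those macros can be sketched in portable C. This is only an illustration, not the winnt.h source: the function names below are made up for the example, and the constants are restated by hand so the block compiles without windows.h. On Windows you would simply use MAKELANGID, MAKELCID, LANGIDFROMLCID, and SORTIDFROMLCID.

```c
#include <stdint.h>

/* Constants restated by hand for this sketch; on Windows they come from winnt.h. */
#define LANG_GERMAN            0x07
#define SUBLANG_GERMAN         0x01
#define SUBLANG_GERMAN_SWISS   0x02
#define SORT_DEFAULT           0x0
#define SORT_GERMAN_PHONE_BOOK 0x1

/* LANGID: sublanguage in the upper 6 bits, primary language in the lower 10. */
uint16_t make_langid(unsigned primary, unsigned sub) {
    return (uint16_t)((sub << 10) | primary);
}

/* LCID: sort ID in bits 16-19, LANGID in the lower 16 bits. */
uint32_t make_lcid(uint16_t langid, unsigned sort_id) {
    return ((uint32_t)sort_id << 16) | langid;
}

/* The reverse direction: pull the pieces back out of an LCID. */
uint16_t langid_from_lcid(uint32_t lcid) { return (uint16_t)(lcid & 0xFFFFu); }
unsigned sortid_from_lcid(uint32_t lcid) { return (unsigned)((lcid >> 16) & 0xFu); }
```

    With these, make_langid(LANG_GERMAN, SUBLANG_GERMAN) gives 0x0407, the Swiss variant gives 0x0807, and adding SORT_GERMAN_PHONE_BOOK to the former gives 0x10407. Note that the arithmetic will happily build 0x10807 too; whether such an LCID is actually supported is a separate question that only the system can answer.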

    Eman then asked a follow-up question:

    Thank you for answer.

    So, to let the user choose a sort within specific LangID i can EnumSystemLocales with LCID_INSTALLED | LCID_ALTERNATE_SORTS, filtering out items with the LangID.

    Please correct if i am wrong again.

    Now this answer is not wrong, but it is a lot more work than is needed. To get the list of alternate sorts available within a specific LANGID there are two methods:

    1. Call EnumSystemLocales with the LCID_ALTERNATE_SORTS flag, and then filter for all of the ones where LANGIDFROMLCID returns the LANGID you are looking for.
    2. Call IsValidLocale for every LCID value of MAKELCID(langid, n) where n is from 0x1 to 0xf, and any time it is valid you know you have found an alternate sort for the langid.
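
    Method #2 might look something like the following sketch. To keep it compilable anywhere, the validity check is passed in as a callback; on Windows that callback would wrap IsValidLocale. The names find_alternate_sorts, lcid_is_valid_fn, and demo_is_valid are hypothetical, and the demo table is a toy that knows only two alternate sorts, not a real list.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a validity check such as IsValidLocale. */
typedef int (*lcid_is_valid_fn)(uint32_t lcid);

/* Probe sort IDs 0x1..0xf for one LANGID. Valid LCIDs are written to out
   (15 slots is always enough) and the number found is returned. */
size_t find_alternate_sorts(uint16_t langid, lcid_is_valid_fn is_valid,
                            uint32_t out[15]) {
    size_t count = 0;
    for (uint32_t sort_id = 0x1; sort_id <= 0xf; ++sort_id) {
        uint32_t lcid = (sort_id << 16) | langid;  /* same layout as MAKELCID */
        if (is_valid(lcid)) {
            out[count++] = lcid;
        }
    }
    return count;
}

/* Toy validity table for demonstration only. */
int demo_is_valid(uint32_t lcid) {
    return lcid == 0x00010407u   /* German (Germany), phone book sort */
        || lcid == 0x0001040eu;  /* Hungarian, technical sort */
}
```

    No string parsing is involved, which is the appeal: fifteen integer probes per LANGID, and the loop keeps working if more sort IDs become valid later.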

    Which method to choose? Well at the moment method #1 may perhaps seem faster since there are way fewer than 15 alternate sorts across all locales, but method #1 returns strings and converting all of those strings to numbers so you can compare them in the locale macros may be more expensive than comparing 15 numbers. Both operations are fast enough that it is one of those silly interview questions to make a candidate answer as to which method is better.

    Personally, I would choose method #2 since the future could have more alternate sorts in it (one never knows!) and I like to avoid dealing with strings when I can. But if that were an interview question I were given, I would be sure not to take that particular job.... :-)

     

    This post brought to you by "Ɣ" (U+0194, a.k.a. LATIN CAPITAL LETTER GAMMA)

  • Sorting it all Out

    And you can't set all of the properties all of the time...

    • 2 Comments

    There is some sort of implicit belief that if a locale property is settable, it should be set.

    Up to and including the .NET Framework 2.0, there are four properties in the NumberFormatInfo class that violate that belief and should not be set: PercentGroupSizes, PercentDecimalDigits, PercentGroupSeparator, and PerMilleSymbol.

    I should probably back up a second and explain when they are meant to be set, and more importantly when setting them will not actually accomplish anything.

    These four properties can be set any time you are using a NumberFormatInfo object to do work. The first three will affect percent formatting and parsing, while the last one will do very little at all.

    These four properties should not be set on the NumberFormat hanging off the CultureAndRegionInfoBuilder class.

    Well, you can set them if you want. But the values you set will not be persisted in the custom culture. So setting them when you are creating a custom culture will not actually accomplish anything. So why spend the time?

    The rules whereby the values are set vary for each property:

    • PercentGroupSizes -- takes the value from NumberGroupSizes
    • PercentDecimalDigits -- takes the value from NumberDecimalDigits
    • PercentGroupSeparator -- takes the value from NumberGroupSeparator
    • PerMilleSymbol -- always starts as U+2030

    The reasons? Well, I am not sure who uses the PerMilleSymbol so I do not know for sure, but I would guess that the reason is the same as the other three properties: that there is no locale/culture that we know of that reports a different value for these properties.

    Until/unless that changes, these are four properties that you should not try to set in your custom cultures....

     

    This post brought to you by "‰" (U+2030, a.k.a. PER MILLE SIGN)

  • Sorting it all Out

    CultureInfo thoughts

    • 0 Comments

    Brad Abrams posted about The SLAR (vol2) on System.Globalization.CultureInfo and included quotes from members of the GIFT team (Shawn Steele and Ihab Abdelhalim).

    One bit in particular stood out in my mind:

    Except for Invariant, don’t expect the culture data provided by this class to remain the same, even between instances running as the same user.

    Now everyone knows one time that may or may not be true, right?

    Bonus points for the first correct answer....

     

  • Sorting it all Out

    100% roundtrip ASCII? 100% roundtrip ANSI?

    • 15 Comments

    Back in January I was talking about the new compiler error C4819 and how the compiler detected invalid characters.

    And anyone who has been reading here knows that the reverse solidus is always the path separator, even when it looks like a yen or a won.

    So among the so-called 'ANSI' code pages, ASCII (0x00 - 0x7f) will roundtrip 100% of the time.

    How many "invalid" slots are there in the 'ANSI' code pages in the 0x80 - 0xff range, exactly?

    Let's take a look at the Windows code pages:

    There you have it. Code page 1256 is the only one that is guaranteed to be able to roundtrip every single code point without losing any of the bytes....

     

    This post brought to you by "¿" (U+00bf, INVERTED QUESTION MARK)

  • Sorting it all Out

    More on the fabled EqualString

    • 22 Comments

    If you go back all the way to April and look at some of the comments to the What the %#$* is wrong with German sorting? post you will see one of those times that the fact that we have to equate sorting and comparison can confuse people about the results.

    And about a week ago in the Hungarian is even more complicated than I thought post, I explained why the lack of a companion function to CompareString that tested for equality kept a particular (uncommon?) feature in Hungarian collation from being supported.

    But the thing that has generated the most email was neither the issues related to German nor the issues related to Hungarian; it was my contention that the difference between CompareString and a theoretical EqualString function was orders of magnitude greater than the difference between RtlCompareUnicodeString and an RtlEqualUnicodeString.

    A bunch of people wanted to understand better why the performance difference is so... well... different.

    And a bunch of other people want to know why, if the Rtl* difference is so near to inconsequential yet they get the two functions why the NLS side of the world would not get two functions if the performance difference is so bleeding consequential.

    Two different, excellent questions in there (and I did retain the most amusing of the wordings of the two questions). To answer the first question, I will explain further the difference between the two scenarios (the binary/ordinal vs. the linguistic).

    • RtlCompareUnicodeString has a simple job -- take two strings and return 0 if they are equal, < 0 if string1 is less than string2, and > 0 if it is greater, in a binary/ordinal sense. Of course this is an ordering that will be unsatisfying to any typical user, and I would suspect most of the atypical ones.
    • RtlEqualUnicodeString has an even simpler job -- take two strings and return TRUE if they are equal and FALSE if they are not, in a binary/ordinal sense.

    Now asking a candidate to write the ideal implementations for each of these functions and then with both of them on the board going through the expected differences in performance between them could make for a mildly interesting Microsoft interview question. But beyond that, the difference is not all that significant (in that post last week I hinted at this and the fastest underlying implementations when I mentioned that "the difference of two numbers not being zero" and "two numbers not being equal" was really not going to be all that significant).

    The key issue here is that these functions are looking through the two strings. As soon as they note any difference of any kind, they can return the result. Period.

    The world of the less lexicographic and more linguistic CompareString and mythical EqualString functions is a very different one. Because in this world those linguistic weight tables are used.

    This is hardest on CompareString where with those weights there are so many different levels of difference, with some levels trumping other levels. Therefore, any difference between the two strings is an interesting bit of trivia unless no greater difference is found later on. Those lower levels include all of the potentially ignorable distinctions in e.g. case, diacritic, width, or kana. Unless a Unicode weight difference is found (which lets CompareString return more quickly), it will have to enumerate the full strings, storing up those lesser differences in case it needs them.

    And this is where the fictional EqualString function would have it easier -- because like its binary/ordinal cousins it would be able to return after any difference is found, of any weight. This is (potentially) hugely faster for so many of the potential strings that CompareString has to test for.

    This should answer the first question (why the performance difference is so... well... different).
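
    To make that concrete, here is a toy model in C. It is emphatically not how CompareString works internally: it assumes ASCII input and just two weight levels (a primary weight that ignores case and a secondary weight that records it), and the names toy_compare and toy_equal are invented for the sketch. Each function reports how many character positions it had to examine.

```c
#include <ctype.h>
#include <stddef.h>

/* Toy weights: primary ignores case, secondary records it (lowercase first). */
static int primary_weight(char c)   { return tolower((unsigned char)c); }
static int secondary_weight(char c) { return isupper((unsigned char)c) ? 1 : 0; }

/* Comparison: a secondary difference is only remembered, because a primary
   difference found later would trump it. Scanning has to continue. */
int toy_compare(const char *a, const char *b, size_t *examined) {
    int pending = 0;  /* first secondary difference seen, if any */
    size_t i = 0;
    for (; a[i] != '\0' && b[i] != '\0'; ++i) {
        int p = primary_weight(a[i]) - primary_weight(b[i]);
        if (p != 0) { *examined = i + 1; return p; }
        if (pending == 0) {
            pending = secondary_weight(a[i]) - secondary_weight(b[i]);
        }
    }
    *examined = i;
    if (a[i] != b[i]) return (a[i] == '\0') ? -1 : 1;  /* shorter string first */
    return pending;
}

/* Equality: in this toy model any character difference is a weight
   difference at some level, so the first one settles the answer. */
int toy_equal(const char *a, const char *b, size_t *examined) {
    for (size_t i = 0;; ++i) {
        if (a[i] != b[i]) { *examined = i + 1; return 0; }
        if (a[i] == '\0') { *examined = i; return 1; }
    }
}
```

    Given "Abcdefgh" and "abcdefgz", toy_equal can say "not equal" after one position, while toy_compare has to walk all eight before it finds the primary difference that decides the ordering.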

    The second question (if the Rtl* difference is so near to inconsequential yet they get the two functions why the NLS side of the world would not get two functions if the performance difference is so bleeding consequential) is a bit more complicated.

    There would be essentially two good reasons to consider adding such a function to a future version of Windows:

    • The need to have separate comparison/sorting operations to provide appropriate linguistic support;
    • The need to have such an absolute linguistic equality operation done frequently enough that the performance difference would be significant in implementations.

    The linguistic support argument is a tough one for two reasons:

    • Existing implementations will by now be depending on the fact that we combine the operations of comparison and sorting, so we could not remove one functionality without breaking callers.
    • In most cases, there really is no difference that would confuse anybody -- in fact to date I do not know of any complaint beyond the ones that I directly inspired by pointing out the differences to native speakers of particular languages either to directly answer the question or in this blog! This suggests that the difference is more theoretical in the real world of what people would be sorting/comparing on computers.

    The performance question is harder to discount, although it is significant to note that good implementations that are building indexes would be using the LCMapString function with the LCMAP_SORTKEY flag to build sort keys. Obviously such sort keys are numbers already and allow both comparisons and sorts to be done even more quickly than the imaginary EqualString function could do its work. And it is hard to argue that "Microsoft must add something fast" if the person asking is actually not using something faster already....

    In fact, sort keys bring the linguistic comparison down to the level of the binary/ordinal world where RtlCompareUnicodeString and RtlEqualUnicodeString do their work. And in using sort keys the difference between the two different operations of comparison and sorting is on the same order of magnitude as between the binary/ordinal functions!
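
    A rough sketch of the idea in C follows. The key layout is a toy invented for this example (all primary weights, a 0x01 separator, then all secondary weights) and is not the format LCMapString produces; it also assumes printable ASCII input so that no weight collides with the separator. The names toy_sort_key and key_compare are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Build a toy sort key: primary weights, 0x01 separator, secondary weights,
   0x00 terminator. Returns the key length in bytes (terminator included),
   or 0 if the buffer is too small. */
size_t toy_sort_key(const char *s, unsigned char *key, size_t cap) {
    size_t len = strlen(s);
    size_t k = 0;
    if (2 * len + 2 > cap) return 0;
    for (size_t i = 0; i < len; ++i)   /* level 1: case-insensitive letters */
        key[k++] = (unsigned char)tolower((unsigned char)s[i]);
    key[k++] = 0x01;                    /* sorts below every primary weight */
    for (size_t i = 0; i < len; ++i)   /* level 2: lowercase before uppercase */
        key[k++] = isupper((unsigned char)s[i]) ? 0x03 : 0x02;
    key[k++] = 0x00;
    return k;
}

/* Once keys exist, ordering and equality are both plain byte comparisons. */
int key_compare(const unsigned char *ka, const unsigned char *kb) {
    return strcmp((const char *)ka, (const char *)kb);
}
```

    All of the linguistic work is paid once, when the key is built. After that, both "which comes first?" and "are they equal?" stop at the first differing byte, which is exactly the binary/ordinal situation described above.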

    The other problem is that in the small number of scenarios where sort keys are not a practical solution, a linguistic comparison is not appropriate (e.g. filenames and other symbolic identifiers). That would mean that converting the unreal EqualString into a real NLS API would actually encourage bad usage and bad development practice!

    (I will talk more about this issue with symbolic identifiers another day, and also it will come up in an upcoming presentation at the Internationalization & Unicode Conference on March 6-8 entitled 'Tales of Incorrect String Comparisons'.)

    It may be of interest to some that the FindNLSString function (new in Vista) probably has more in common with the fairy-tale EqualString than it does with CompareString, since it too pays no attention to non-ignored differences other than to keep looking for the substring. However, FindNLSString is answering a different question: it is also looking for the location of the substring, and that makes it slower. The mythological EqualString function would still be faster.

    (I will talk more about this issue with the comparison and contrast of FindNLSString, CompareString, and LCMapString with LCMAP_SORTKEY flag another day!)

    But regardless of whether it had been able to justify its future inclusion, the arguments for and against the invented¹ EqualString are certainly worthy of a blog post!

     

    1 - Special thanks to the Microsoft Word 2003 Thesaurus for its support here in finding appropriate words to use for the unreal EqualString function!

    This post brought to you by "ß" (U+00df, a.k.a. LATIN SMALL LETTER SHARP S)
    (A letter that was overheard in the character locker room after reviewing this post muttering "Curses! Foiled again...")

  • Sorting it all Out

    29th Internationalization and Unicode Conference

    • 3 Comments

    That is right, the 29th Internationalization and Unicode Conference will be going on soon, and you can be there, too! The conference will be happening on March 6-8, 2006 in Burlingame, CA, USA.

    The Unicode Consortium press release is right here.

    And the initial agenda for the conference is right here.

    There are two presentations I will be doing at the conference:

    Though admittedly I may be lobbying them to reverse the ordering (they have the intro talk second rather than first!). I'll talk further on this as I know something further....

    Both talks should be pretty cool, as there is a lot of new information in both of them that can affect security, performance, proper internationalization, work in Whidbey, Vista, prior versions, and more!

    I'll be talking more about these talks and related issues as the conference start date draws nearer....

  • Sorting it all Out

    What does 'accessible' mean? Is U2 accessible?

    • 10 Comments

    Or, more properly, is the hotel owned by (among others) U2, The Clarence Hotel in Dublin city center accessible?

    Most properly is the simple question "is there an accessible entrance to the hotel restaurant?"

    This was the question that was posed to the doorman of The Clarence Hotel as Chris Kiernan and I pulled up and we were admiring the flight of stairs and lack of visible ramps or elevators, heading to the dinner party celebrating colleague Adrian's graduation.

    The doorman looked at us, puzzled. Realizing we were in the midst of another of those Screw language; it's about the dialect, baby! moments, I then added "for a wheelchair."

    The wheelchair thing was spur of the moment, but I realized that I did not know precisely how to refer to the scooter in Ireland. I don't think I had even seen one other than mine on any of my visits here!

    The doorman was no longer puzzled at what we wanted, but he was clearly unsure about the answer. He offered an "I don't think so. (pause) Let me check." and he was off into the hotel for a few minutes.

    Chris and I, while waiting, discussed the fact that the term "accessible entrance" was not very well known here, and had a small laugh about the fact that the term to use was simply not known to Chris, either.

    Anyway, a few minutes later the doorman came out and apologetically said that no, there was no accessible entrance. Though we could of course check the entrance on the other side if we wanted (he had probably been gone long enough to have walked to the other side and back, so our chances seemed slim, but we still had some hope).

    As it turns out, calling this person a "doorman" was probably inaccurate; he must have just been a big fan of Jim Morrison with an official title of Doorsman. Because the door on the other side of the building was completely accessible and we made it straight in to the restaurant!

    But in any case we had a wonderful dinner and went to a bar after that where I had my first pint of Guinness, an entirely non-memorable experience (though Chris tried a bit and was not pleased either; he suggested it had not aged properly. Whatever that means!).

    All in all, very nice evening with some great conversations, and a chance to see Anna again (she had just moved from Redmond to Dublin).

    And I am pleased to report that The Clarence Hotel in Dublin city center is quite accessible, even if the same cannot be said about all of the staff at the door....

  • Sorting it all Out

    Private fonts: for members only

    • 11 Comments

    About a month ago, I had someone ask me about the images that appeared in MSKLC for characters that had no visible representation, wondering whether they were part of a font or not. After I explained that they were, she asked me how this font was able to be used without it being on the machine (the font itself was not all that interesting to her, but the capability to do that with any font was).

    Then this last week, Rick Engle was asking on an internal Microsoft distribution list whether a font could be included as a project resource.

    So today I thought I'd take these two items and handle them both at once, since they are really the same item.

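    The answer is that yes you can do this, and it works quite well using AddFontMemResourceEx for unmanaged (GDI) code and PrivateFontCollection.AddMemoryFont for managed (GDI+) code.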

    The line between managed and unmanaged code can blur when you use managed code like WinForms, since some of the controls are unmanaged underneath (like the RichTextBox) and require you to use both techniques at the same time to help things work properly. There are several controls that fit into this category, so I generally find it better to be safe than sorry and just always include them both.

    And it is especially important to use both techniques in Whidbey (VS 2005) since GDI/Uniscribe is used in many cases rather than GDI+, as I have discussed previously.

    Also, be sure that you are following the licensing rules and restrictions for any font you wish to include!

    Basically, start by adding the font file to your project, right-clicking on the file in the Project Explorer, and choosing 'Embedded Resource.' The name you will load is the font file with a fully qualified name (project namespace included), e.g. WindowsApplication1.MyPrivateFont.ttf.

    Here are the namespaces we will need:

    using System.Reflection;
    using System.Drawing;
    using System.Drawing.Text;
    using System.IO;
    using System.Runtime.InteropServices;

    And here is the basic code:

    // Adding a private font (Win2000 and later)
    [DllImport("gdi32.dll", ExactSpelling=true)]
    private static extern IntPtr AddFontMemResourceEx(byte[] pbFont, int cbFont, IntPtr pdv, out uint pcFonts);

    // Cleanup of a private font (Win2000 and later)
    [DllImport("gdi32.dll", ExactSpelling=true)]
    internal static extern bool RemoveFontMemResourceEx(IntPtr fh);

    // Some private holders of font information we are loading
    static private IntPtr m_fh = IntPtr.Zero;
    static private PrivateFontCollection m_pfc = null;

    /////////////////////////////////////
    //
    // The GetSpecialFont procedure takes a size and
    // creates a font of that size using the hardcoded
    // special font name it knows about.
    //
    /////////////////////////////////////
    public Font GetSpecialFont(float size) {

       Font fnt = null;

       if(null == m_pfc) {

          // First load the font as a memory stream
          Stream stmFont = Assembly.GetExecutingAssembly().GetManifestResourceStream(
                                  "WindowsApplication1.MyPrivateFont.ttf");

          if(null != stmFont) {

             // 
             // GDI+ wants a pointer to memory, GDI wants the memory.
             // We will make them both happy.
             //

             // First read the font into a buffer
             byte[] rgbyt = new Byte[stmFont.Length];
             stmFont.Read(rgbyt, 0, rgbyt.Length);

             // Then do the unmanaged font (Windows 2000 and later)
             // The reason this works is that GDI+ will create a font object for
             // controls like the RichTextBox and this call will make sure that GDI
             // recognizes the font name, later.

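             // The import declares pcFonts as 'out', so the call must use 'out' too.
             // Keep the returned handle so RemoveFontMemResourceEx can be called later.
             uint cFonts;
             m_fh = AddFontMemResourceEx(rgbyt, rgbyt.Length, IntPtr.Zero, out cFonts);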

             // Now do the managed font
             IntPtr pbyt = Marshal.AllocCoTaskMem(rgbyt.Length);
             if(IntPtr.Zero != pbyt) {
                Marshal.Copy(rgbyt, 0, pbyt, rgbyt.Length);
                m_pfc = new PrivateFontCollection();
                m_pfc.AddMemoryFont(pbyt, rgbyt.Length);
                Marshal.FreeCoTaskMem(pbyt);
             }
          }
       }

       if(null != m_pfc && m_pfc.Families.Length > 0) {
          // Handy how one of the Font constructors takes a
          // FontFamily object, huh? :-)

          fnt = new Font(m_pfc.Families[0], size);
       }

       return fnt;
    }

    Now this code is assuming you are only ever adding one font but plan to ask for different sizes of it; if you need to expand it to use multiple fonts then you will obviously want to take that PrivateFontCollection object and that IntPtr for the unmanaged font and convert them to be a collection or something to hold more than one font, perhaps using a Hashtable that is indexed by the name?

    Although I was tempted to make this procedure more generic, the requirement for the font to be an embedded resource and the fact that I am technically on vacation made this seem a little less critical for a quick sample. People who are interested can obviously do what they need to here, of course!

    For cleanup, you will eventually want to dispose of your PrivateFontCollection and call RemoveFontMemResourceEx, though for the latter the documentation claims that when the process goes away, the system will unload the fonts even if the process did not call RemoveFontMemResourceEx (so you can base whether to call that on how clean you want your code to be!).

    Those are the basics and should be enough to get you started....

    As I think about the way that private fonts are implemented in both GDI and GDI+, a possible model for a future version of a "private CultureInfo" seems to emerge.

    It is obvious that if you are creating a CultureInfo that can be enumerated and used by everyone on the machine that you should have administrative permissions. But perhaps this model could be used to create a more private CultureInfo for use in a particular application. Just as with the private fonts you could not enumerate them or use them systemwide, but there already is a mechanism to get those things done. Is it something you could use in an application yourself?

    Seems like an interesting idea, right?

     

    This post brought to you by "‰" (U+2030, a.k.a. PER MILLE SIGN)
    (A character that was quite impressed to see that I managed to add something 'international' at the end of the post!)
