Sorting it all Out Michael Kaplan's random stuff of dubious value Be sure to read the disclaimer here first!
Another one of those "new in Vista" posts. :-)
Unicode has added a great deal to support mathematics, from Unicode Technical Report 25 (Unicode Support for Mathematics) to the various Mathematical subranges in Unicode (see the Mathematical Symbols column in the Code Charts for Symbols and Punctuation).
My favorite range is the Mathematical Alphanumeric Symbols block in Unicode, which currently has all of the characters from U+1d400 to U+1d7ff (almost 1000 in all, with some spaces that were left in, as you can see from the code chart).
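For the curious, here is a quick way to poke at that block from Python. Treat it as a rough sketch: it relies on whichever version of the Unicode Character Database ships with your Python's `unicodedata` module, so the number of assigned characters it reports can vary with the Unicode version. The helper name `assigned_in_block` is my own, not anything standard.

```python
import unicodedata

# Range of the Mathematical Alphanumeric Symbols block, as quoted above.
BLOCK_START, BLOCK_END = 0x1D400, 0x1D7FF

def assigned_in_block(start=BLOCK_START, end=BLOCK_END):
    """Return the code points in [start, end] that have a character name."""
    return [cp for cp in range(start, end + 1)
            if unicodedata.name(chr(cp), None) is not None]

assigned = assigned_in_block()
total = BLOCK_END - BLOCK_START + 1  # size of the whole range, gaps included

print(f"{len(assigned)} of {total} code points are assigned")
print(unicodedata.name(chr(BLOCK_START)))
```

The difference between the two numbers is the "spaces that were left in."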
Why is it my favorite?
Well, I was having a conversation a few years back with Murray Sargent of Microsoft (one of the representatives of MS at Unicode Technical Committee meetings and a co-author of UTR #25). He was explaining why Unicode, which is generally speaking a plain text standard, was going to approve a block of characters that included many different letters and numbers with bold, italic, and other variations usually reserved for "rich text" outside the scope of Unicode.
"It is all about mathematics, and representing it in plain text," he explained. And he has a point; while I may use bold or italicized text for emphasis, in mathematics there is actual semantic meaning that is expressed in symbols an variables that have such attributes.
At that point, thinking about collation, I asked him if there was ever a time that it would be interesting or important to fold those differences together, for all of the following:
At first, Murray thought I was trying to make them all equal, and objected strenuously to that; luckily I had something different in mind. I pointed out some scenarios:
And he definitely saw the benefit to such a collation.
So, after this conversation (and a few others with various math experts), in Vista a special LCID is being added:
0x0001007f (MAKELCID(MAKELANGID(LANG_INVARIANT, SUBLANG_NEUTRAL), SORT_INVARIANT_MATH))
It is an alternate sort for the invariant locale, because mathematics is independent of specific locale (kind of like invariant is!).
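If you want to see how that number falls out of the macros, here is a small sketch that mirrors the bit layout of `MAKELANGID` and `MAKELCID` in Python. The constant values (`LANG_INVARIANT = 0x7f`, `SUBLANG_NEUTRAL = 0x00`, `SORT_INVARIANT_MATH = 0x1`) are what I understand the Windows SDK headers to define; they are assumptions here, though they do reproduce the LCID given above.

```python
# Assumed values, as I understand the Windows SDK headers to define them.
LANG_INVARIANT = 0x7F
SUBLANG_NEUTRAL = 0x00
SORT_INVARIANT_MATH = 0x1

def make_lang_id(primary, sub):
    """Mirror MAKELANGID: sublanguage in the top 6 bits, primary in the low 10."""
    return ((sub << 10) | primary) & 0xFFFF

def make_lcid(lang_id, sort_id):
    """Mirror MAKELCID: sort ID above the 16-bit language ID."""
    return ((sort_id & 0xF) << 16) | (lang_id & 0xFFFF)

lcid = make_lcid(make_lang_id(LANG_INVARIANT, SUBLANG_NEUTRAL),
                 SORT_INVARIANT_MATH)
print(f"0x{lcid:08x}")
```

The low word is the invariant language ID, and the sort ID sitting above it is the only thing that distinguishes this LCID from plain invariant.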
This locale causes each of the above letters to be a mere secondary and/or tertiary difference away from everything else on the list. The same principles were applied to all of the Greek letters and numbers in the block.
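To get a feel for what "a mere secondary and/or tertiary difference" means, here is a loose approximation in Python. It is emphatically not the Windows sorting algorithm: it leans on NFKC compatibility folding from `unicodedata` to stand in for the primary weight, and uses the original string only as a tie-breaker. The function name `math_fold_key` is made up for this illustration.

```python
import unicodedata

def math_fold_key(s):
    """Approximate sort key: folded letters first, original form as tie-breaker.

    NFKC maps the styled mathematical letters and digits back to their plain
    counterparts, which plays the role of a primary weight here. The later
    tuple elements only matter when the primary parts are equal.
    """
    folded = unicodedata.normalize("NFKC", s)
    return (folded.casefold(), folded, s)

bold_a = "\U0001D400"  # MATHEMATICAL BOLD CAPITAL A
bold_b = "\U0001D401"  # MATHEMATICAL BOLD CAPITAL B

words = ["B", bold_a, "A", bold_b, "a"]
print(sorted(words, key=math_fold_key))
```

With a key like this, the bold letters land next to their plain counterparts instead of being pushed out to wherever their code points would put them, yet they never compare as equal to them.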
Please note that this is not something that can be selected in Regional and Language Options as a locale (neither can invariant, so obviously an alternate sort of invariant cannot be chosen). But it can be used in any programmatic situation where one is looking to compare strings, find within strings, or create sort keys.
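As a hedged sketch of what such programmatic use might look like, here is a call to the Win32 `CompareStringW` function through Python's `ctypes`, passing the LCID from this post. The wrapper name `compare_math_invariant` is my own invention, the code only does anything on Windows (it returns `None` elsewhere), and I am assuming the LCID is accepted by the Windows version it runs on.

```python
import sys

MATH_INVARIANT_LCID = 0x0001007F  # the LCID described in this post

def compare_math_invariant(a, b, flags=0):
    """Compare two strings with CompareStringW under the math invariant sort.

    Returns -1, 0, or 1 on Windows, and None on any other platform.
    """
    if sys.platform != "win32":
        return None  # the Win32 API is not available here

    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    compare = kernel32.CompareStringW
    compare.argtypes = [wintypes.DWORD, wintypes.DWORD,
                        wintypes.LPCWSTR, ctypes.c_int,
                        wintypes.LPCWSTR, ctypes.c_int]
    compare.restype = ctypes.c_int

    # A length of -1 tells the API that the strings are null-terminated.
    result = compare(MATH_INVARIANT_LCID, flags, a, -1, b, -1)
    if result == 0:  # 0 signals failure
        raise ctypes.WinError(ctypes.get_last_error())
    # CSTR_LESS_THAN = 1, CSTR_EQUAL = 2, CSTR_GREATER_THAN = 3
    return result - 2

print(compare_math_invariant("\U0001D400", "B"))
```

Subtracting 2 from the return value is the usual trick for getting a conventional negative/zero/positive comparison result.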
And it is right there in Vista, for those who are mathematically inclined....
This post brought to you by "𝐀" (U+1d400, a.k.a. MATHEMATICAL BOLD CAPITAL A)