By necessity, my blogs are often about something on the micro scale -- one customer report, one phenomenon that interests me, one event, one bug, one concert, one wheelchair, one function. Even the occasional groupings of these things are quite small.
And then there are trends. Thus in Why do we call w 'double u' -- doesn't it look more like a 'double v'?, I talked about the Swedish Academy's change to the way the letters W and V were to be handled in collation, about the impact on Microsoft software when this change eventually reaches the point where it needs to be integrated, and about how one day this "theoretical" issue that is a punch line in a blog post from Raymond or me would have far-reaching design consequences. That led to The disunification of Norwegian and Danish sorting a few days later, where I noted a "Nordic" scenario where it was already happening not so far from Sweden. The follow-up on this theoretical scenario turning real and being fixed in Vista then culminated in the fix for SQL Server in The disunification of Norwegian and Danish sorting (SQL Server 2008 Edition!).
Meanwhile, back in Malmo (a place in Sweden that I have visited several times over the years, for the festival)....
Several years prior to the fix in SQL Server, in Unicode and SQL Collations have nothing to do with each other, I explained to a customer, who was confused about why the SQL_SwedishStd_Pref_CP1_CI_AS collation returned different results for Unicode and non-Unicode columns, that Unicode columns always go through the Windows collations.
Note how the assumption was that the Unicode column and thus the Windows collation behavior was correct.
Now, less than three years later, another customer reports:
Thanks for all the research on this issue, we really appreciate it! As I see it ‘v’ and ‘w’ should not be treated the same in the Swedish language and as was pointed out in the article http://blogs.msdn.com/michkap/archive/2006/04/25/583307.aspx referred below and subsidiary in http://www.saol.se/saol13_pres.html (written in Swedish from the Swedish Academy) the two letters ‘v’ and ‘w’ should be distinct letters and not treated as equally in e.g. sorting. That the Finnish government does not seem to have changed there meaning must mean that Sweden and Finland can not share the same collation in SQL Server. I must say that it is really strange that this has not been corrected for so long. As we see it this behaviour should be changed to the right one as soon as possible through the use of e.g. a specific Swedish collation that is different from the Finnish one. What is the process to get this to work? Does that go through Microsoft Sweden office or Connect? I guess that this is not changed over a night, so that we will have to live with this for a while, but I should definitely recommend a change. Can you help with this?
I have forwarded the information on to the appropriate owners, so this first customer report asking for the suggested change has been duly noted by the people who need to know about it.
But the zeroth customer report (to use the zero-based counting system that I recall seeing in elevators (lifts) in Sweden!) is of course not the tipping point for determining when the change is most appropriate to make -- so there will obviously need to be some research to determine when would be the best release in which to make the change. And when it has been long enough.
This will require more than the "micro" report that made its way to this blog.
Though of course once the change is made, the fact that there is a mitigation for those not ready for the change -- the fact that the Finns look like they are not changing the same way -- should ease the pain a bit! :-)
In the meantime, the customer asked if there was a workaround, a way to get the newer behavior sooner.
I found one, really the only one I could think of.
A way to make a letter that is not a V and that has a unique alphabetic weight that could masquerade as a "new Swedish collation style W".
It starts with ℣, aka U+2123 (VERSICLE).
It has a unique alphabetic weight just after V but before W.
In fact, it has always had such a weight, since as far back as NT 3.1!
Now there is no lowercase version (only an uppercase one), but if one built a calculated column that replaced all instances of both W and w with ℣ then indexing on that calculated column will allow every case-insensitive Swedish Windows-style collation in SQL Server to return the expected results, and every case-sensitive Swedish Windows-style collation in SQL Server to return almost the expected results.
For completeness, replacing Ŵ and ŵ (U+0174 and U+0175, aka CAPITAL and SMALL LATIN LETTER W WITH CIRCUMFLEX) with ℣ plus some diacritic (like U+0302 -- COMBINING CIRCUMFLEX ACCENT) would handle the other "W-style" letter moved by Swedish/Finnish today....
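To make the substitution concrete, here is a minimal sketch in Python of the text that such a calculated column would hold. The function name and the translation table are my own inventions for illustration, and the sketch only builds the substituted string; the actual ordering still comes from the Swedish Windows collation applied to that column, which Python does not reproduce.

```python
# Hypothetical helper: builds the "shadow" text a calculated column could store.
# It does NOT sort anything; ordering is left to the Windows collation.

VERSICLE = "\u2123"              # ℣, weighted just after V and before W
COMBINING_CIRCUMFLEX = "\u0302"  # used to keep Ŵ/ŵ distinct from plain W/w

# Map both cases of W to VERSICLE (there is no lowercase VERSICLE),
# and W WITH CIRCUMFLEX to VERSICLE plus a combining circumflex.
_W_TO_VERSICLE = str.maketrans({
    "W": VERSICLE,
    "w": VERSICLE,
    "\u0174": VERSICLE + COMBINING_CIRCUMFLEX,  # Ŵ
    "\u0175": VERSICLE + COMBINING_CIRCUMFLEX,  # ŵ
})


def versicle_sort_text(text: str) -> str:
    """Return text with every W-style letter replaced by a VERSICLE form."""
    return text.translate(_W_TO_VERSICLE)


if __name__ == "__main__":
    for word in ["Wasa", "watt", "Vasa", "\u0174"]:
        print(word, "->", versicle_sort_text(word))
```

Because case is folded away by the substitution, a case-sensitive collation can no longer tell W from w in the shadow column, which is the "almost" in "almost the expected results".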
So here's to the lookout for that tipping point!
This blog brought to you by ℣ (U+2123, aka VERSICLE)
It has now been two weeks since Where's Waldo^H^H^H^H^HMichael? (aka It probably wasn't worth the wait and yet you waited!) and there has only been one blog after it.
There is a problem getting off the ground here. I think I'll try and explain why, if I can.
Did you ever watch The X-Files?
There were always two kinds of episodes: the standalone ones, and the ones that tied into the show's larger mythology (the Mytharc).
Now the latter made up at most 1/3 of the episodes of the original series, but it is no accident that they were always the two-parters, the ones at the beginning and end of the season, and the ones to show up during Sweeps.
They really were like the yeast that helped the bread of the series rise.
Well, there are the same two different kinds of blogs in this Blog -- the blogs that stand on their own and just talk about specific (often externally reported) issues, and the blogs that tie into the mythology of my random stuff of dubious value.
The lines were often blurry, even for me as I wrote, since sometimes what began as the former would turn into the latter as I investigated and found that there were larger issues. And I tried to make sure that even if you didn't care about larger issues that there would be some value in each blog since you might never read another (especially if you came in via a Google search as most of my traffic seems to).
But for me, the Mytharc blogs are the ones that drive the Blog; the standalone ones just fill in the dead space.
I know that the person asking the question I might have been answering or the people that found my answer via Google might not agree, but they are obviously feeling a bit more self-centered so it is no big deal that they have trouble seeing the importance of the other Blogs -- they probably aren't even reading them at all, or at least even close to as carefully.
And here we come to the problem, the thing that has kept me from blogging (beyond random trips to Las Vegas and Los Angeles, where I was really way too busy to blog!).
My cup runneth under when it comes to the Mythology lately, in part for the reason I talked about in Where's Waldo^H^H^H^H^HMichael? (aka It probably wasn't worth the wait and yet you waited!), but in part for the larger reason that, while I think I have turned the corner on the whole Liz thing (finally?), in part due to internalizing an old Henry James quote [1], I am not really feeling inspired by the same larger themes that inspired me previously.
So I think I have been able to have fun and enjoy myself, but not to get the Blog thing figured out.
I kind of faked it for a year or so, writing with the larger themes I had on hand even if I wasn't feeling them the same way.
And it was a busy year -- I was going out, traveling, eating, drinking, sleeping, and everything else one does when one is living. Though I was just existing for the bulk of it.
Going through the motions.
(song title!)
Anyway, I was kind of faking it, and several regular readers would call me on that from time to time.
When you use no yeast or old yeast or (dare I say it?) stale yeast, then one thing is for sure and for certain -- the bread will not taste the same as it did.
Some would think of the last year as the largest number of CLb's (Career Limiting blogs) published since this Blog's inception -- just goes to show you the down side to baking bread with no yeast.
For most people making use of the Blog that would probably be just fine since the traffic is mostly coming from Google searches anyway. Except for one thing.
Me.
I can't write that way.
I need something to drive me -- the larger themes, the Blog's mythology. I just feel like I can't fake the yeast anymore.
There are still things going on that are interesting, and things that interest me (I mentioned several in Where's Waldo^H^H^H^H^HMichael? (aka It probably wasn't worth the wait and yet you waited!) and that is just for starters).
And people are still asking questions.
So I will keep writing on and off, covering all that stuff, while I am searching for new themes. New inspirations. New mythology.
If you are a regular reader who is disappointed by this, I'm sorry -- truly I don't want to let people down.
When I come back (and yes, I do plan to come back full strength, eventually) I may with whatever changes lose some more regular readers, the ones who would also probably visit celebrities and tell them how they used to be cool.
But if you want to hang in there, then by extension of my #1 philosophy in this blog (never write anything I wouldn't read), there is a better-than-average chance that I won't disappoint you.
Or her....
(for Liz)
1 - The quote in question: "Be not afraid of life. Believe that life is worth living, and your belief will help create the fact." - Henry James
This post brought to you by ྊ (U+0f8a, a.k.a. TIBETAN SIGN GRU CAN RGYINGS)
In one of the very first blogs I wrote, I pointed out that Microsoft does not use the Unicode Collation Algorithm.
Believe it or not, at the time some people actually asked me whether I thought I might get in trouble for that blog. Looking at it now I can't even imagine why they would have thought that -- there are so many other blogs that are much more effective at getting me into trouble, after all. I can inspire an almost Pavlovian response with certain topics, which inspire a "this is what I'm talking about" mail to some people.
Anyway....
Microsoft does not use the UCA. In fact, it still does not use the UCA.
There are consequences to this fact -- that the collation model whose full time job is to attempt to implement principles in the Unicode Standard as it sorts is not the one that Microsoft uses. Consequences that pop up at the most unlikely and unexpected times and can knee a guy right in the groin.
Like the other day, when I received from a guy named Ron a mail that was not as shiny and happy as REM imagined in that song of theirs that Michael Stipe hates so much:
This isn't a request for support, and I don't expect a response.
I just wanted to let you know that the hot fix in http://support.microsoft.com/kb/955612 leaves some sort keys broken. I've applied that hot fix and still get broken results for the Unicode code points FE71, FE77, FD79, FE7D, and FE7F.
For example the sort key for FE71 is 00 00 01 00 01 00 dc 01 01 01
I would not have sent this to you except that there's no way to give any feedback on the hot fix page other than to pay $99.00 to speak to a support person.
It really looks like MS doesn't want to find out when it's code is broken. Given that I used to work for MS, I find that depressing.
It would be nice to have this fixed in some future release.
Hmmm. I count seven issues that kind of screamed for a response of some sort. I'm gonna try to cover them all.
I'm going to take them out of order, though.
FOURTH OF ALL, the bug. If you compare the sort keys of some of these characters across versions (the first sort key is from XP, the second is from Vista, the third is from Server 2008):
U+fe71 (ARABIC TATWEEL WITH FATHATAN ABOVE)
XP: 01 01 01 01 80 07 06 a0 00
Vista: 40 03 40 fa 01 01 02 12 01 01 00
Server 2008: 00 00 01 00 01 00 dc 01 01 01 00
U+fe77 (ARABIC FATHA MEDIAL FORM)
XP: 01 01 01 01 80 07 06 a3 00
Vista: 40 f8 01 00 40 dc 01 02 0d 01 02 00 12 01 01 00
Server 2008: 00 00 01 00 01 00 df 01 01 01 00
U+fe7d (ARABIC SHADDA MEDIAL FORM)
XP: ff ff 01 01 01 01 00
Vista: 40 ea 40 fc 01 01 01 01 00
Server 2008: 00 00 01 00 01 00 e3 01 01 01 00
U+fe7f (ARABIC SUKUN MEDIAL FORM)
XP: 01 01 01 01 80 07 06 a6 00
Vista: 40 f2 40 fc 01 01 01 01 00
Server 2008: 00 00 01 00 01 00 e2 01 01 01 00
The explanation of each is simple enough -- the first was from that point where many of the characters had weird weights just to try to fit them somewhere since they did not exactly fit in with the one weight per character model.
The second was an attempt to at least put them with the other Arabic characters.
The third was an attempt to be more compatible with Unicode, kind of like the UCA tries to do.
Oops.
It took the documented decompositions from the Unicode Character Database, and treated them like Expansions, those things I mentioned in A&P of Sort Keys, part 5 (aka EXPANSIONing your horizons). And it turned these kind of complicated compatibility characters, the ones I have been railing against in prior blogs like
and tried to kind of rehabilitate them using these documented equivalencies, such as:
U+fe71 --> U+0640 U+064b (ARABIC TATWEEL + ARABIC FATHATAN)
U+fe77 --> U+0640 U+064e (ARABIC TATWEEL + ARABIC FATHA)
U+fe7d --> U+0640 U+0651 (ARABIC TATWEEL + ARABIC SHADDA)
U+fe7f --> U+0640 U+0652 (ARABIC TATWEEL + ARABIC SUKUN)
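If you want to check the equivalencies listed above for yourself, the Unicode Character Database data is exposed by Python's standard unicodedata module. This is just a small sketch for looking up the documented compatibility decompositions; the helper name is mine, and it says nothing about how Windows builds its expansion tables.

```python
import unicodedata


def compat_decomposition(ch: str) -> list[str]:
    """Return the decomposition of ch as hex code points, minus the <tag>."""
    fields = unicodedata.decomposition(ch).split()
    # Compatibility mappings start with a tag such as "<medial>"; drop it.
    return [f for f in fields if not f.startswith("<")]


if __name__ == "__main__":
    for cp in (0xFE71, 0xFE77, 0xFE7D, 0xFE7F):
        ch = chr(cp)
        parts = " ".join("U+" + p for p in compat_decomposition(ch))
        print(f"U+{cp:04X} {unicodedata.name(ch)} --> {parts}")
```

Each of the four characters decomposes to the TATWEEL (U+0640) followed by a single combining mark, which is the pair of code points the expansion has to cope with.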
Now in Microsoft's tables, the TATWEEL is given no weight (ref: You've got to be kashidding me), and the other characters are treated as diacritics. This makes the XP weights just behind the times and the Vista weights really weird, with them being treated as full letters even though they are nominally compatible with things that are either weightless or diacritics.
Thus the first two attempts here sucked (the worst examples of How does Microsoft assign new collation weights?), and the third was a genuine attempt to do the right thing.
Unfortunately, there are at least three problems/limitations with our expansions, and this bug is due to two of them.
You see, in expansions all the usual code that does not fill in values for the weightless characters? Doesn't happen. Plus it does not properly handle combining characters (just like it does not handle compression, as I pointed out in A&P of Sort Keys, part 5 (aka EXPANSIONing your horizons)). And thus between these two problems you have all these NULLs Ron is pointing out.
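For anyone who wants to spot keys like these in their own dumps, here is a rough sketch that flags zero bytes appearing anywhere before the final terminator. It assumes the sort key has been dumped as space-separated hex bytes (as in this post) and that 00 is only supposed to show up as the last byte; the function name is made up for the example.

```python
def embedded_nul_offsets(sort_key_hex: str) -> list[int]:
    """Return offsets of 00 bytes that occur before the final terminator."""
    key = bytes.fromhex(sort_key_hex)  # whitespace between bytes is skipped
    # Assumption: a trailing 00 is the terminator and is not an error.
    body = key[:-1] if key.endswith(b"\x00") else key
    return [offset for offset, value in enumerate(body) if value == 0]


if __name__ == "__main__":
    keys = {
        "U+FE71 on XP": "01 01 01 01 80 07 06 a0 00",
        "U+FE71 on Server 2008": "00 00 01 00 01 00 dc 01 01 01 00",
    }
    for label, key_hex in keys.items():
        print(label, "-> embedded NULs at", embedded_nul_offsets(key_hex))
```

Run against the dumps in this post, the XP key comes back clean while the Server 2008 key reports zero bytes in both the alphabetic and diacritic positions.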
This bug repros on Windows 7 by the way. Someone should get on that ASAP. Any NLS testers around? :-)
Getting back to the remaining six points in the question, now:
FIRST OF ALL, this blog is not really intended to be a support venue, so SECOND OF ALL just like no one ever expects the Spanish Inquisition, one should never expect a response.
And THIRD OF ALL, the hotfix mentioned in that KB article was for a specific targeted bug. Being unhappy at a heretofore unreported bug not being fixed in it is like being mad that Apple did not provide a patch.
FIFTH OF ALL since this bug has nothing to do with the hotfix, that would be the wrong place to leave the feedback anyway.
And SIXTH OF ALL, when I consider the notion of a former employee who has no idea where to report a bug and finds that to be depressing, I myself get depressed.
I know that for the next ten years after I leave this company I would know exactly where to send bugs, even if I decided not to send them. :-)
Finally, SEVENTH OF ALL, given that this bug exists in Windows 7, I too think it would be nice if it were fixed in a future version. Hint, hint!
This blog brought to you by the many fine characters mentioned above that have been so consistently mistreated by Windows despite their long-standing existence in Unicode
It has been several weeks since my last blog in/on this Blog.
If you are a regular reader, you may have noticed this already, though one of the downsides of the whole RSS/subscription model is that for the most part it is only the act of doing something that gets noticed, and under the "squeaky wheel gets the grease" theory, the silent Blog is often not noticed right away.
I generally subscribe to the "squeaky wheel gets replaced" theory, myself. But that is a topic for another day, if at all. It doesn't even apply here, anyway.
So, whether you noticed or not, this blog can act as a ping and an update, of sorts.
Even as I am writing it, I realize that at least half of this blog is not the sort of thing I would read, which is a violation of my general policy about every kind of writing I have done or plan to do. I doubt you will judge me about this as harshly as I judge myself about it, but please keep it in mind. :-)
Now as to why I haven't been blogging.
Well, at first I was out of town, as I mentioned (ref: After calling the airline, iBOT a ticket to Vegas!). And way too busy to write.
A topic for another day, or you can read my Facebook note about it if you are so inclined. It is my only note to date so it shouldn't be that hard to find. Treatment here will likely have different emphasis, fewer pictures, and no videos (the videos, which include iBOT tricks, are on YouTube now anyway, due to an interesting problem that may also be a topic for another day).
Then there was my job, and a bunch of work items that came out of the trip, and the attempt (at times successful) to have a life over the subsequent weeks.
Most of these three items are also topics for other days....
It is just that it has really been a year since my life was thrown into catastrophic disarray, when Liz died. And the past year I have been really pondering everything, from my Blog to my job to my career to my life.
I can't report on anything substantive that has come out of this thinking, though it has been valuable.
Most of it is also a topic for another day, if at all.
So anyway, for the record I am going to put some basic information out there for anyone who might have been concerned....
First of all, I still have a job, and was not one of the 1.5% of Microsoft that was laid off recently.
The fact that so many people sent email to my Microsoft email address to ask me if I still had a job is a topic for another day, if at all -- some careful use of logic should explain why this does not make very much sense!
There are several factors related to the layoff that I have opinions about but for now I'm not going to share most of them since it is a sensitive subject and I think I'd just piss people off.
Second of all, my job is pretty much the same as what it was. No one in our group was laid off and as far as I know no one's job in the group was altered as a side effect of the layoff.
Though I did have to waste several days of work because of it.
I should explain that.
You see, I was working on a presentation for a group whose first version work was kind of internationally flawed and is being rewritten for their second version, and the PM thought it would be good to get them trained from the start on some of the globalization basics. That PM knew some of it and wanted to learn the rest, and also wanted the team to learn it so they could be right from the very beginning this time. Contact was with both me and a PM on our side, but I was asked to do the work on behalf of both.
The fact that it feels good to have your colleagues trust you to represent them in such situations could also be a topic for another day, though probably not until/unless I am working somewhere else -- I can't have PMs I work with thinking they have me feeling good, it would hurt my reputation!
Unfortunately that PM was laid off.
Which gave me a serious sense of vujà dé (my own little personal back formation that I define as being sure one has experienced something before, and that they didn't like it), thinking that contrary to my thoughts in Not so small as to be internationally stupid, Mini might have a point. I mean, if a group tasked with reducing jobs chooses to eliminate both the position and the work of someone who was trying to make sure globalization issues are covered, then perhaps someone agrees with Mini that this is one of the reasons why Microsoft is so big. And perhaps that could be on the chopping block when such choices exist and must be made.
Of course I know that it is unlikely that this was the only work this PM had and thus unlikely to lay this out as any kind of strategic thought. It is much more likely to be short-sightedness that someone will either spend ten times as much money (or more) to correct later or customers will pay the price and perhaps inspire QFEs that cost thirty times as much money (or more) to fix then. But this would hardly be the first time that a wrong decision was made in the tech. industry across any company, so that is hardly anything new.
Wiser heads, had they been asked, might have counseled that the training could still have happened, but this was not the only thing changed in the group so that apparently wasn't practical -- there are bigger fish to fry, so to speak.
The net impact is that I lost a few days, but since someone else lost their job I'm not going to worry so much about the days.
In fact, if that PM wants a recommendation and/or reference inside or outside of MS, I made it clear I'd do it. The entire issue, starting from when it was first brought up, was done professionally and technically with a methodical understanding of many of the previous mistakes, an understanding of a need to know more, a recognition of the need to address them, and a passion for both quality and correct behavior that anyone interested in good results would be proud to have.
This is, I think, a case of "friendly fire" incidental to the layoff itself. If I were hiring, this person wouldn't need to look for a job.
Had it been up to me it wouldn't have been my choice if I were given the directive to lay people off, but there are probably very good reasons why executives don't request that list from me (even beyond the fact that they probably don't know it exists). Let's leave it as read that my list would have been significantly shorter than the one they used, but mine still would have saved Microsoft more money.
But anyway, I said I wasn't going to talk about the layoff. So I'll stop now.
Back to my list....
Third of all, this Blog has a flaw in it.
It starts with the name.
Sorting it all Out.
Now taking my interest in collation and moving it to a multi-level pun is to my way of thinking kind of cool. Not as cool in my opinion as the WinFS Team Blog's name (What's in Store), but with the same kind of intent.
Certainly not as cool as the whole GM-approved Architect of Sorts job title thing, which I still think is pretty cute.
But then it is therein that the problem lies.
I mean, in the 2887 or so blogs in this Blog, only some have had anything to do with collation in the general sense, and most of the rest were not about any kind of attempt to order, cross-reference, or categorize information in any of the senses that either sorting or collation would cover.
Me? I'm the sort of person who prefers the Exchange client/Schedule+ combo to Outlook, because I liked the ability to say yes to every meeting invite (which made the meeting owner feel good) and also miss every meeting I didn't feel like going to (with the affirmative defense of "oh sorry, I didn't have Schedule+ running!" if anyone asked why I wasn't there). I don't need my life to be as orderly as all that.
So here in this blog, I have been trying to do so many things:
And so much of that is unordered, and it is prioritized (if you can call it that) only by a disconnected yet occasionally logical process that I have come to think of as disassociative linkage (hat tip to colleague Jennifer Gentleman who is the first person I used the term on, to describe the thought process I witnessed on a couple of occasions!).
Which is to say, not really prioritized in any real sense that can be measured or referenced or cataloged or explained or understood.
So I think it might be time to change the name of this Blog.
You know, to set aside the cute joke and try to come up with something that really does capture things a bit better.
I'm still working on this, and of course lots of other things. And that will be the topic for another day. One I might actually get to! :-)
All of which leads to the bigger issue, one that the name issue kind of distracts me from. This is whether the Blog is the best way for me to accomplish whatever it is I'm trying to do.
For now I think it serves a need, and I am really reluctant to not fill that need just because I'm feeling uncertain about what to do. So within the next few days I'll be putting all of the blogs that were going to go up back into rotation, and maybe I'll even cover some of the topics mentioned above, to the delight and/or dismay of some of the regular readers.
Things should be back to normal (whatever that is) soon, provided I still have a job after the PTB read this blog, of course....