January 2005 - Posts

Care to take a break from the wonders of Unicode and elaborate on the pain that is the console codepage 437? Read More...
I have been reading people all over the internet who hate that Microsoft is perhaps in the future going to limit Windows Update to legal copies of Windows (Automatic Update would be their only option) with the Windows Genuine Advantage program (more info Read More...
Last Friday, Jochen Kalmbach, in response to A little bit about the new CharUnicodeInfo class, asked the following: By the way: is there some equivalent to FoldString, especially "MAP_PRECOMPOSED" and "MAP_COMPOSITE"? Neither StringInfo nor TextInfo Read More...
(special thanks to James for pointing out this bug) It is amazing how sometimes one can be so busy trying to make a point that one can miss the point. A few days ago, I pointed out that CharNext(ch) != ch+1, a lot of the time. That ought to be true. Read More...
The IsTextUnicode API has been around since NT 3.5, according to the Platform SDK histories. According to the PSDK, its purpose is as follows: The IsTextUnicode function determines whether a buffer is likely to contain a form of Unicode text. The function Read More...
This character has an interesting history. As noted by Roozbeh Pournader: Neither Farsi, nor a symbol. In real life, it is the official emblem of the government of the Islamic Republic of Iran. Technically that would make it a logo and thus not a suitable Read More...
A few days ago, Larry Osterman pointed out Why is Control-Alt-Delete the secure attention sequence (SAS)? It is funny but one popular topic that comes up in supporting MSKLC is people wanting to be able to develop a keyboard layout that blocks the keystroke Read More...
CharUnicodeInfo is a new class that is being added to Whidbey. It has one very straightforward job -- pick up property information from the Unicode Character Database. But there is a lot of data there! The name provides the proper balance between being Read More...
Chris Walker mentioned to me yesterday something I did not know about Notepad -- that it uses the FoldString API with the MAP_FOLDDIGITS flag. This takes all of the digits in Unicode and folds them down into regular old zero to nine for everything you Read More...
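The digit folding described above can be illustrated outside of Win32. The following is a minimal Python sketch of the same idea, not a call to FoldString itself: the helper name fold_digits is hypothetical, and it assumes that "digits" means characters carrying a Unicode decimal digit value.

```python
import unicodedata

def fold_digits(text: str) -> str:
    """Replace every Unicode decimal digit with its ASCII 0-9 equivalent.

    Hypothetical helper for illustration; it approximates the effect of
    digit folding and is not the Win32 FoldString implementation.
    """
    folded = []
    for ch in text:
        # decimal() returns the digit value, or the default for non-digits.
        value = unicodedata.decimal(ch, None)
        folded.append(str(value) if value is not None else ch)
    return "".join(folded)

# Arabic-Indic 3 and 4, Devanagari 5, fullwidth 1 and 2.
print(fold_digits("\u0663\u0664 + \u096b = \uff11\uff12"))  # 34 + 5 = 12
```

Characters without a decimal digit value, such as SUPERSCRIPT TWO, pass through unchanged in this sketch.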
As the guy from The Princess Bride said, I do not think that word means what you think it means.... you ever find yourself feeling that way? Localization is one of the words, mainly because people mix up basic terms and assume they mean what other terms Read More...
In Fall 2004, Cathy Wissink and I were in San Jose at the Unicode Technical Committee meeting (being held at Apple) along with 20+ of our colleagues from various companies involved with internationalization. We spoke at the IMUG (International Mac User's Read More...
Yesterday, Brad Abrams asked Do all programmers speak english? The answer, which I learned in part from the volunteer efforts to translate/localize large parts of the Trigeminal website, is no. Not all of them do. I learned (and continue to learn!) many Read More...
I promised I would let people know when I was speaking next.... I will be giving a talk at the .NET Developer's Association on February 14, 2004 in Redmond, WA, USA. I'll be talking about a lot of the exciting globalization features that are new in Whidbey. Read More...
Tor Lillqvist suggested I talk a little about the difference between CAPSLOCK and ShiftLock -- Something you might want to blog about some day: the difference between "CapsLock" and "ShiftLock". Especially determining programmatically whether the "VK_CAPITAL" Read More...
(This post is based on one that was sent to the microsoft.public.platformsdk.mslayerforunicode newsgroup back in June of 2001) It all started simply enough: the Windows division decided to fund a project for a Unicode Layer for Win9x to help people be Read More...
Early last year, Raymond Chen talked about how Char.IsDigit matches more than just 0 through 9 and later last year I talked about Crossing the DIGITal divide. But in both cases the conversation is limited to digits, and not the wide world of numbers Read More...
In my prior post about Comparison confusion: INVARIANT vs. ORDINAL, I talked about the meaning of ordinal comparisons. They are the same as what the C Runtime calls a lexicographic comparison in functions like strcmp and wcscmp -- basically a binary Read More...
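The gap between "digits" and "numbers" is easy to see in any Unicode-aware runtime. As a rough analogue (Python's str methods rather than the .NET Char.IsDigit discussed above, so the category boundaries are not guaranteed to match), this sketch classifies a few characters; the classify helper is hypothetical.

```python
import unicodedata

def classify(ch: str) -> dict:
    """Report how Python's Unicode predicates see a single character."""
    return {
        "decimal": ch.isdecimal(),   # usable in base-10 positional notation
        "digit": ch.isdigit(),       # also includes e.g. superscript digits
        "numeric": ch.isnumeric(),   # anything with a numeric value
        "value": unicodedata.numeric(ch, None),
    }

# ASCII 7, Arabic-Indic 7, superscript two, one half, Roman numeral eight.
for ch in ["7", "\u0667", "\u00b2", "\u00bd", "\u2167"]:
    print(f"U+{ord(ch):04X}", classify(ch))
```

Only the first two samples count as decimal digits here, while all five carry a numeric value.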
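For readers who want to see what a binary comparison looks like in practice, here is a small Python sketch. Python compares strings by code point, while wcscmp on Windows compares 16-bit code units; the ordinal_key_utf16 helper below is a hypothetical stand-in that imitates code unit order by comparing big-endian UTF-16 bytes.

```python
def ordinal_key_utf16(s: str) -> bytes:
    """Sort key that orders strings by UTF-16 code unit values.

    Hypothetical helper: big-endian UTF-16 bytes compare in the same
    order as the underlying 16-bit code units.
    """
    return s.encode("utf-16-be")

words = ["apple", "Zebra", "\u00e4rger"]
# No linguistic knowledge is applied: 'Z' (0x5A) sorts before 'a' (0x61).
print(sorted(words))  # ['Zebra', 'apple', 'ärger']

# Code point order and UTF-16 code unit order disagree once
# supplementary characters (surrogate pairs) are involved.
bmp, supplementary = "\ufffd", "\U00010000"
print(bmp < supplementary)                                         # True
print(ordinal_key_utf16(bmp) < ordinal_key_utf16(supplementary))   # False
```

The last two lines show why "binary" needs a stated encoding: the same pair of strings orders differently by code point than by UTF-16 code unit.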
Lionel Fourquaux asks: As a member of a student group that provides some help with MS products, I've been trying to create a keyboard layout that would give a (partial) answer to a common complaint from Unix users: how to get a "Compose" key in Windows. Read More...
Code pages are out there, and they are important. A huge amount of legacy data exists in them, and we have to convert them all to Unicode to get anything done on Windows 2000, XP, Server 2003, or Longhorn. But sometimes they are not designed very smartly. Read More...
Josh has helped yet another blog with his post about fixing up the blog to look better! Mine is still a work in progress, of course. I'll play with colors another day. One strange problem I had was that I was unable to use the Custom CSS Selectors section.... Read More...
Of course if you asked any rational person whether a keyboard was hardware or software, they would say hardware. Whether it plugs into serial, PS/2, USB, FireWire, or it is wireless, it is most certainly a hardware device. (We'll ignore the case of the Read More...
(The alternate title should be spoken with either a circa-1982 Jeff Spicoli or circa-1989 Theodore "Ted" Logan mannerism and accent) U+feff has two jobs in the Unicode standard: Job #1, and its namesake, is as a ZERO WIDTH NO-BREAK SPACE. The name kind Read More...
I'm not entirely sure how I feel about last night's West Wing episode (365 Days) just yet. I'm going to type/dictate here about it for a bit and maybe I will know by the time I am done. There are two sections in particular that interested me.... Leo Read More...
Although collation on Windows gives a weight to every single code point 1 , there are times that this does not really have an intuitive meaning. What I mean to say is that there are times that the question " does string1 equal string2? " may have meaning Read More...
Dead keys? As my friend Cathy likes to say -- they're not dead, and they aren't keys. A little over a month ago, I pointed out that Dead keys are not intuitive . And nothing has changed since then -- they are still not very intuitive. And they're not Read More...
(the title was inspired by a decade and a half of Law & Order on NBC, then A&E, and now TNT!) I don't want to knock collation on Windows, because I think it rocks. It covers a lot of territory, and it gets the job done (and done well) in a lot Read More...
Let's face it, sometimes a language has complex typographical issues in it. If it were all easy to do then typography would not be such a complex and lucrative field that requires both technical understanding of language and artistry which I do not myself Read More...
I just noticed that there are lots of folks like planet.xmlhack.com that literally do not seem to have any content other than the feeds of the blogs they like. And I am sometimes randomly one of them. Of course this is no different than the stuff that Read More...
I am sorry, but for reasons that I cannot explain I seem unable to make a posting about case without involving a pun in the title of the posting. To wit: Get off my [lower] case! (or: Casing, the 1st) The [Upper]Case of the Turkish İ (or: Casing, the Read More...
Tor Lillqvist noted that in some of my previous entries on casing (cf: Get off my [lower] case! (or: Casing, the 1st) and The [Upper]Case of the Turkish İ (or: Casing, the 2nd)) I made some hints about the casing table and NTFS. He goes on to ask: The Read More...
Ok, now appearing on Sorting It All Out, over on the left side of the screen, are search links for Google and MSN, as well as a link for searching for Unicode information at http://FileFormat.Info (special thanks to Andrew for permission!). I am not Read More...
My "random stuff of dubious value" will soon have search capabilities. How soon? Well, as soon as I can enrich my meager HTML skills a little. My expertise for this stuff ends at the charset.... Read More...
You can all really tell I like calendars on Win32, can't you? :-) Well, if you are using the Gregorian calendar, then you have a good support story. You have full localization into every language. And I am not just talking about the Windows UI language Read More...
Obviously to some, claiming that calendars are just there for show may be a grating overstatement. But if you think about it, there are only four major purposes for the OS to support calendars: 1) Feed data for date formatting APIs so that dates that Read More...
Earlier today, Raymond Chen sent me a piece of email that mentioned an important point for developers who iterate a string one character at a time. It's a lot more interesting than what I was going to post so I'll do the boring one later or tomorrow (or Read More...
A few days ago I posted about the Updated EULA for the Microsoft Layer for Unicode (MSLU) and one person left the following comment: Unfortunately, it seems that the new EULA for MSLU is still not compatible with open source software. I know that MS Read More...
Today we'll talk about U+0138, LATIN SMALL LETTER KRA. It looks like this: "ĸ". It has many interesting characteristics. For example, Latin letters have case and usually both an upper- and lower-case form, whereas LATIN SMALL LETTER KRA does not have Read More...
Someone pointed me to a note in Robert Hensing's blog entitled Miscreant hiding techniques: Would the real explorer.exe please stand up? And the relevance of 1979 when doing searches . . . The first part of his post talked about a machine that had a Read More...
The five characters in question are APL Functional Symbols (if you have either Arial Unicode MS or Code2000 on your machine, you will see the characters; if not then you can look for them in the Miscellaneous Technical block on the Unicode site): ⍊ U+234a Read More...
I just read Ryan's post Encodings in Strings are Evil Things (Part 7), and all I can do is hope that anyone who reads it will have also at the very least read Part 5 of the series so they know that sometimes these code points do not represent whole characters, Read More...
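To make the point about code points and whole characters concrete, here is a minimal Python sketch that attaches combining marks to the preceding base character. The function name group_combining is hypothetical, and this is deliberately simpler than full grapheme cluster segmentation (it ignores cases such as Hangul jamo and joiner sequences).

```python
import unicodedata

def group_combining(text: str) -> list[str]:
    """Group each base code point with the combining marks that follow it.

    Simplified illustration only: a code point with a non-zero canonical
    combining class is appended to the previous group.
    """
    clusters: list[str] = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters

text = "e\u0301a"  # 'e' + COMBINING ACUTE ACCENT, then 'a'
print(len(text))                   # 3 code points
print(len(group_combining(text)))  # 2 user-perceived characters
```

Code that walks the string one code point at a time would split the accented letter in two, which is the hazard the post warns about.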
This seemed like a useful tip to share, originally posted in the newsgroup.... The question that was asked: I've internationalized my product. From most of the reading I've done, it seems that the satellite assemblies need to be built "by hand" and so Read More...
This article will be the first in an occasional recurring series of articles that point out little-known historical character stories in Unicode. The character in question is U+213a, ROTATED CAPITAL Q. It is a Capital Q turned 90° counter-clockwise. It Read More...
The other day I was reading Rob Earhart's post on Hungarian and four things occurred to me: 1) Rob is right about the fact that in the kernel mode code that is not the naming convention they use. 2) I do like Hungarian notation myself. And there are Read More...
Kind of an answer to Robert Scoble's plea to please write better RSS headlines. Sometimes my posts will have relevant titles, sometimes they may even have relevant content. But as I told people years ago when they asked me why I was so into this international Read More...
A few days ago I mentioned the new compiler error C4819 for C/C++. When I did so, I quoted the meaning of the error: C4819 occurs when an ANSI source file is compiled on a system with a codepage that cannot represent all characters in the file. A few Read More...
This is a fun one. :-) Every time I am updating my keyboards, the keyboard list that showed up on my keyboards in the logon dialog also was updated. I never really thought about it before. And I used to own the code for Regional and Language Options. Read More...
A few days ago, PEK asked in the newsgroups about WideCharToMultiByte and MultiByteToWideChar: I'm a bit confused about the first parameter in MultiByteToWideChar. It is telling which code-page to use. You could use the value CP_ACP ("ANSI Code page"), Read More...
The West Wing was interesting last night and much more topical for me as Jed Bartlet described my biggest (well, most visible) symptom without naming it -- disequilibrium. Looking it up on dictionary.com finds the following definition: Loss or lack of Read More...
I was looking at Elyasse's Weblog and was reminded of one of the coolest feature entries in Whidbey. I think I have been waiting roughly 112 versions of the Microsoft compilers for this. Well, probably not that many but it does feel like that.... New Read More...
Earlier this morning, Peter Ibbotson asked me: Perhaps you explain something to me (I appreciate you may be the wrong person to ask, but you seem to be a sorting expert). When XP was in beta I put a bug report on this odd sorting behaviour with explorer, Read More...
Today someone had sent mail around noting that a toString() call on a date in 1911 threw an ArgumentOutOfRangeException when the Taiwan calendar was used and asking if it was a known issue. It is, because the calendar started in 1912. One of a zillion Read More...
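The arithmetic behind that exception is simple enough to sketch. The following Python function is a hypothetical illustration, not the .NET TaiwanCalendar implementation: it works on whole years only and uses ValueError as a stand-in for ArgumentOutOfRangeException.

```python
FIRST_GREGORIAN_YEAR = 1912  # year 1 of the Taiwan calendar

def to_taiwan_year(gregorian_year: int) -> int:
    """Convert a Gregorian year to a Taiwan calendar year.

    Hypothetical helper for illustration; it rejects years before the
    calendar's first year instead of producing year 0 or a negative year.
    """
    if gregorian_year < FIRST_GREGORIAN_YEAR:
        raise ValueError(
            f"{gregorian_year} is before the start of the Taiwan calendar"
        )
    return gregorian_year - (FIRST_GREGORIAN_YEAR - 1)

print(to_taiwan_year(1912))  # 1
print(to_taiwan_year(2005))  # 94
```

A date in 1911 would map to year 0, which the calendar does not have, so rejecting it is the expected behavior rather than a bug.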
A couple of definitions here -- though of course these are not official throughout MS (there is no common standard definition across all of Microsoft, nor could there be!). Spec -- The specification for a feature. Usually written by a Program Manager Read More...
This post is another of the series about international and non-international issues surrounding keyboards, MSKLC, and accessibility. Through them I will deal with issues important to developers in their application, issues important to keyboard authors Read More...
Back in May of 2004, someone asked the following question in the newsgroups: I'm programming in ASP.NET. How could I convert my date to UK format? Thanks in advance. To which someone else suggested the following code: DateTime x=DateTime.Now; string Read More...
At last, the longstanding problems with the end user license agreement (EULA) for MSLU have been fixed. You can look at the text of the EULA here or you can get it by downloading MSLU here. This new EULA meets the intent that our team has had for MSLU since Read More...
Believe it or not, I have gotten this question with language as bad as (and often worse than!) the title line of this post. The quick answer, as I am sure you might have guessed, is no. :-) For the longer answer, I'll go to the FAQ section of the white paper Read More...
One of the 14 people who read this blog was reading my recent post "How do sort keys work?" and he thought that he may have found a bug. He didn't, but it was an interesting point that he brought up. So I thought I'd mention it here. (Note to all -- Read More...
 