• The Old New Thing

    Converting from traditional to simplified Chinese, part 3: Highlighting differences


    One of the things that is interesting to me as a student of the Chinese languages is to recognize where the traditional and simplified Chinese scripts differ. Since this is my program, I'm going to hard-code the color for simplified Chinese script: maroon.

    To accomplish the highlighting, we take advantage of listview's custom-draw feature. Custom-draw allows you to make minor changes to the way items are displayed on the screen. It's a middle ground between having listview do all the work (via default drawing behavior) and having the program do all the work (via owner-draw).

    The custom-draw cycle for shell common controls consists of a series of NM_CUSTOMDRAW notifications, starting with the most general and getting more specific. There are several reasons for this breakdown. First, it allows the listview control to short-circuit a portion of custom-draw behavior if the parent window does not indicate that it wishes to customize a particular behavior. This reduces message traffic and improves performance when large numbers of items are being drawn. Second, it allows the parent window to target its customizations to the drawing stages it is interested in.

    Listviews are peculiar among the shell common controls in that their items sometimes (but not always) have sub-items. This complicates the drawing process since it requires listview to accommodate both styles: large icon view does not use sub-items, but report view does. To address this, the CDDS_ITEMPREPAINT stage is entered when an item is about to paint, and any changes made by the parent window are considered to be effective for the entire item. If you want to make changes on a per-subitem basis, you have to respond to CDDS_ITEMPREPAINT | CDDS_SUBITEM and set your properties (or reset them if you want to return to the default) for that sub-item.

    With those preliminary remarks settled, we can dive in.

    class RootWindow : public Window
    {
     ...
     LRESULT OnLVCustomDraw(NMLVCUSTOMDRAW* pcd);
     HWND m_hwndLV;
     COLORREF m_clrTextNormal;
     Dictionary m_dict;
    };

    We declare our listview custom-draw handler as well as the member variable in which we remember the normal text color so that we can reset it for columns we do not intend to colorize.

    LRESULT RootWindow::OnNotify(NMHDR *pnm)
    {
     switch (pnm->code) {
     ...
     case NM_CUSTOMDRAW:
      if (pnm->hwndFrom == m_hwndLV) {
       return OnLVCustomDraw(CONTAINING_RECORD(
                             CONTAINING_RECORD(pnm, NMCUSTOMDRAW, hdr),
                                                    NMLVCUSTOMDRAW, nmcd));
      }
      break;
     }
     return 0;
    }

    If we receive an NM_CUSTOMDRAW notification from the listview control, we call our new handler. The multiple calls to the CONTAINING_RECORD macro are necessary because the NMHDR structure is nestled two levels deep inside the NMLVCUSTOMDRAW structure.
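
    To make the two-level unwrapping concrete, here is a portable sketch of the same idea. The struct names and the CONTAINER_OF macro are hypothetical stand-ins for the Windows types and the CONTAINING_RECORD macro; only the nesting shape is meant to match.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for CONTAINING_RECORD: given a pointer to a field,
// step back by the field's offset to reach the enclosing structure.
#define CONTAINER_OF(ptr, type, field) \
    (reinterpret_cast<type*>(reinterpret_cast<char*>(ptr) - offsetof(type, field)))

// Stand-ins that mimic the nesting NMHDR -> NMCUSTOMDRAW -> NMLVCUSTOMDRAW.
struct Header       { int code; };
struct DrawInfo     { Header hdr; std::uint32_t stage; };
struct ListDrawInfo { DrawInfo nmcd; std::uint32_t textColor; int subItem; };

// Two hops: header -> generic draw info -> listview-specific draw info.
ListDrawInfo* FromHeader(Header* header) {
    DrawInfo* inner = CONTAINER_OF(header, DrawInfo, hdr);
    return CONTAINER_OF(inner, ListDrawInfo, nmcd);
}
```

    In this layout both offsets happen to be zero, so a plain cast would produce the same address, but going through the offset keeps the code correct if the nested member is ever not the first one.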

    LRESULT RootWindow::OnLVCustomDraw(NMLVCUSTOMDRAW* pcd)
    {
     switch (pcd->nmcd.dwDrawStage) {
     case CDDS_PREPAINT: return CDRF_NOTIFYITEMDRAW;
     case CDDS_ITEMPREPAINT:
      m_clrTextNormal = pcd->clrText;
      return CDRF_NOTIFYSUBITEMDRAW;
     case CDDS_ITEMPREPAINT | CDDS_SUBITEM:
      pcd->clrText = m_clrTextNormal;
      if (pcd->iSubItem == COL_SIMP &&
        pcd->nmcd.dwItemSpec < (DWORD)Length()) {
        const DictionaryEntry& de = Item(pcd->nmcd.dwItemSpec);
        if (de.m_pszSimp) {
          pcd->clrText = RGB(0x80, 0x00, 0x00);
        }
      }
      break;
     }
     return CDRF_DODEFAULT;
    }

    During the CDDS_PREPAINT stage, we indicate our desire to receive CDDS_ITEMPREPAINT notifications. During the CDDS_ITEMPREPAINT stage, we save the normal text color and indicate that we want to receive sub-item notifications. It is in the sub-item notification CDDS_ITEMPREPAINT | CDDS_SUBITEM that the real work happens.

    First, we reset the color to the default on the assumption that we will not need to colorize this column. But if the column is the simplified Chinese column, if the item number is valid, and if the simplified Chinese is different from the traditional Chinese, then we set the text color to maroon.
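
    The staged negotiation can be tried out away from Windows with a toy model. Everything in the sketch below (the Stage and Reply enums, Paint, Highlighter) is a hypothetical stand-in rather than the real listview; it only imitates the rule that each stage is sent only if the previous reply asked for it, and that a color set for one sub-item sticks until it is reset.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-ins for the draw stages and the handler's replies.
enum class Stage { PrePaint, ItemPrePaint, SubItemPrePaint };
enum class Reply { DoDefault, NotifyItemDraw, NotifySubItemDraw };

struct DrawInfo { Stage stage; int item; int subItem; std::uint32_t textColor; };
using Handler = std::function<Reply(DrawInfo&)>;

constexpr std::uint32_t kDefaultColor = 0x000000;
constexpr std::uint32_t kMaroon = 0x000080;  // COLORREF layout 0x00bbggrr

struct PaintResult {
    std::vector<std::vector<std::uint32_t>> colors;  // color used per cell
    int notifications = 0;                           // messages sent to parent
};

// Toy control: asks the parent at each stage only if the previous reply opted in.
PaintResult Paint(int items, int columns, const Handler& parent) {
    PaintResult r;
    r.colors.assign(items, std::vector<std::uint32_t>(columns, kDefaultColor));
    DrawInfo di{Stage::PrePaint, -1, -1, kDefaultColor};
    ++r.notifications;
    if (parent(di) != Reply::NotifyItemDraw) return r;  // short-circuit
    for (int i = 0; i < items; ++i) {
        di = DrawInfo{Stage::ItemPrePaint, i, -1, kDefaultColor};
        ++r.notifications;
        if (parent(di) != Reply::NotifySubItemDraw) {
            r.colors[i].assign(columns, di.textColor);  // applies to whole item
            continue;
        }
        for (int s = 0; s < columns; ++s) {
            di.stage = Stage::SubItemPrePaint;
            di.subItem = s;  // textColor deliberately carries over between sub-items
            ++r.notifications;
            parent(di);
            r.colors[i][s] = di.textColor;
        }
    }
    return r;
}

// Parent-side logic in the same shape as the handler in the article.
struct Highlighter {
    explicit Highlighter(int c) : column(c) {}
    int column;
    std::uint32_t normal = kDefaultColor;
    Reply operator()(DrawInfo& di) {
        switch (di.stage) {
        case Stage::PrePaint: return Reply::NotifyItemDraw;
        case Stage::ItemPrePaint:
            normal = di.textColor;  // remember the default color
            return Reply::NotifySubItemDraw;
        case Stage::SubItemPrePaint:
            di.textColor = (di.subItem == column) ? kMaroon : normal;
            break;
        }
        return Reply::DoDefault;
    }
};
```

    Calling Paint(2, 3, Highlighter(1)) sends one notification for the control, one per item, and one per sub-item, whereas a handler that answers DoDefault at the first stage hears nothing further.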

    That's enough with the Chinese/English dictionary for now. All this time, and we don't even have search capability yet! We'll work on that next month.

  • The Old New Thing

    Converting from traditional to simplified Chinese, part 2: Using the dictionary


    Now that we have our traditional-to-simplified pseudo-dictionary, we can use it to generate simplified Chinese words in our Chinese/English dictionary.

    class StringPool {
    public:
     LPWSTR AllocString(const WCHAR* pszBegin, const WCHAR* pszEnd);
     LPWSTR DupString(const WCHAR* pszBegin) {
      return AllocString(pszBegin, pszBegin + lstrlen(pszBegin));
     }
     ...
    };

    The DupString method is a convenience we will use below.

        if (de.Parse(buf, buf + cchResult, m_pool)) {
         bool fSimp = false;
         for (int i = 0; de.m_pszTrad[i]; i++) {
          if (pmap->Map(de.m_pszTrad[i])) {
           fSimp = true;
           break; // one simplified character is enough
          }
         }
         if (fSimp) {
          de.m_pszSimp = m_pool.DupString(de.m_pszTrad);
          for (int i = 0; de.m_pszTrad[i]; i++) {
           if (pmap->Map(de.m_pszTrad[i])) {
            de.m_pszSimp[i] = pmap->Map(de.m_pszTrad[i]);
           }
          }
         } else {
          de.m_pszSimp = NULL;
         }
         ...
        }

    After we parse each entry from the dictionary, we scan the traditional Chinese characters to see if any of them have been simplified. If so, then we copy the traditional Chinese string and use the Trad2Simp object to convert it to simplified Chinese.

    If the string is the same in both simplified and traditional Chinese, then we set m_pszSimp to NULL. This may seem a bit odd, but it'll come in handy later. Yes, it makes the m_pszSimp member difficult to use. I could have created an accessor function for it (so that it falls back to traditional Chinese if the simplified Chinese is NULL), but I'm feeling lazy right now, and this is just a one-shot program.
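
    For comparison, the accessor that was skipped might look something like the following. The Entry struct and both function names are hypothetical; the real DictionaryEntry has more members and this is not code from the program.

```cpp
#include <cassert>

// Hypothetical cut-down entry: simp is nullptr when the simplified form is
// identical to the traditional one.
struct Entry {
    const wchar_t* trad;
    const wchar_t* simp;
};

// Text to show in the simplified column: fall back to traditional when null.
inline const wchar_t* SimplifiedText(const Entry& e) {
    return e.simp ? e.simp : e.trad;
}

// The null pointer doubles as a cheap "the two scripts differ" flag.
inline bool HasDistinctSimplified(const Entry& e) {
    return e.simp != nullptr;
}
```

    Keeping the null around rather than storing a duplicate string means the "are they different?" question costs a pointer test instead of a string comparison.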

    void RootWindow::OnGetDispInfo(NMLVDISPINFO* pnmv)
    {
     ...
      switch (pnmv->item.iSubItem) {
       case COL_TRAD:    pszResult = de.m_pszTrad;    break;
       case COL_SIMP:    pszResult =
          de.m_pszSimp ? de.m_pszSimp : de.m_pszTrad; break;
       case COL_PINYIN:  pszResult = de.m_pszPinyin;  break;
       case COL_ENGLISH: pszResult = de.m_pszEnglish; break;
      }
     ...
    }

    Finally, we tell our OnGetDispInfo handler what to return when the listview asks for the text that goes into the simplified Chinese column. With these changes, we can display both the traditional and simplified Chinese for each entry in our dictionary.

    Next time, a minor tweak to our display code, which happens to illustrate custom-draw as a nice side-effect.

  • The Old New Thing

    Converting from traditional to simplified Chinese, part 1: Loading the dictionary


    One step we had glossed over in our haste to get something interesting on the screen in our Chinese/English dictionary program was the conversion from traditional to simplified Chinese characters.

    The format of the hcutf8.txt file is a series of lines, each of which is a UTF-8 encoded string consisting of a simplified Chinese character followed by its traditional equivalents. Often, multiple traditional characters map to a single simplified character. Much more rarely—only twice in our data set—multiple simplified characters map to a single traditional character. Unfortunately, one of the cases is the common syllable 麼, which has two simplifications, either 么 or 麽, the first of which is far more productive. We'll have to keep an eye out for that one.

    (Note also that in real life, the mapping is more complicated than a character-for-character substitution, but I'm willing to forgo that level of complexity because this is just for my personal use and people will have realized I'm not a native speaker long before I get caught up in language subtleties like that.)

    One could try to work out a fancy data structure to represent this mapping table compactly, but it turns out that simple is better here: an array of 65536 WCHARs, each producing the corresponding simplification. Most of the array will lie unused, since the characters we are interested in lie in the range U+4E00 to U+9FFF. Consequently, the active part of the table is only about 40KB, which easily fits inside the L2 cache.

    It is important to know when a simple data structure is better than a complex one.
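
    As a rough, portable illustration of the flat-table idea (not the Windows code used in this series), the sketch below uses char16_t in place of WCHAR. The class and function names are hypothetical. It also checks the size arithmetic: U+4E00 through U+9FFF is 20,992 code units, or 41,984 bytes at two bytes each.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>

// Bytes of the table that are actually touched: U+4E00..U+9FFF, 2 bytes each.
constexpr std::size_t kActiveBytes = (0x9FFF - 0x4E00 + 1) * sizeof(char16_t);
static_assert(kActiveBytes == 41984, "about 41 KB of the 128 KB table is live");

// Hypothetical flat lookup table indexed directly by UTF-16 code unit.
class FlatTradToSimp {
public:
    void Add(char16_t trad, char16_t simp) { table_[trad] = simp; }
    // Returns 0 when no simplification is recorded for the character.
    char16_t Map(char16_t trad) const { return table_[trad]; }
private:
    std::array<char16_t, 65536> table_{};  // zero-initialized
};

// Replaces every character that has a recorded simplification.
std::u16string ToSimplified(const FlatTradToSimp& map, std::u16string text) {
    for (char16_t& ch : text) {
        if (char16_t simp = map.Map(ch)) ch = simp;
    }
    return text;
}
```

    A lookup is a single array index with no hashing or branching, which is the whole appeal; the cost is 128 KB of mostly untouched memory.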

    The hcutf8.txt file contains a lot of fluff that we aren't interested in. Let's strip that out ahead of time so that we don't waste our time parsing it at run-time.

    $_ = <> until /^# Start zi/; # ignore uninteresting characters
    while (<>) {
     next if length($_) == 7 &&
             substr($_, 0, 3) eq substr($_, 3, 3); # ignore NOPs
     print;
    }

    Run the hcutf8.txt file through this filter to clean it up a bit.
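
    The magic number 7 in the filter comes from the encoding: each character in this range takes three bytes in UTF-8, so a line holding one character mapped to itself is 3 + 3 bytes plus the newline. Here is the same test as a small C++ sketch; IsNopLine is a hypothetical helper, and it assumes the line ends in a bare '\n' as the Perl script sees it.

```cpp
#include <cassert>
#include <string>

// True when the line is exactly two identical 3-byte UTF-8 sequences followed
// by a newline, i.e. a character that "simplifies" to itself.
bool IsNopLine(const std::string& line) {
    return line.size() == 7 && line.compare(0, 3, line, 3, 3) == 0;
}
```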

    Now we can write our "traditional to simplified" dictionary.

    class Trad2Simp
    {
    public:
     Trad2Simp();
     WCHAR Map(WCHAR chTrad) const { return _rgwch[chTrad]; }
    private:
     WCHAR _rgwch[65536]; // woohoo!
    };

    Trad2Simp::Trad2Simp()
    {
     ZeroMemory(_rgwch, sizeof(_rgwch));
     MappedTextFile mtf(TEXT("hcutf8.txt"));
     const CHAR* pchBuf = mtf.Buffer();
     const CHAR* pchEnd = pchBuf + mtf.Length();
     while (pchBuf < pchEnd) {
      const CHAR* pchCR = std::find(pchBuf, pchEnd, '\r');
      int cchBuf = (int)(pchCR - pchBuf);
      WCHAR szMap[80];
      DWORD cch = MultiByteToWideChar(CP_UTF8, 0, pchBuf, cchBuf,
                                      szMap, 80);
      if (cch > 1) {
       WCHAR chSimp = szMap[0];
       for (DWORD i = 1; i < cch; i++) {
        if (szMap[i] != chSimp) {
         _rgwch[szMap[i]] = chSimp;
        }
       }
      }
      // advance to the next line even if this one produced no mapping
      pchBuf = std::find(pchCR, pchEnd, '\n') + 1;
     }
     _rgwch[0x9EBC] = 0x4E48;
    }

    We read the file one line at a time, convert it from UTF-8, and for each nontrivial mapping, record it in our dictionary. At the end, we do our little 么 special-case patch-up.

    Next time, we'll use this mapping table to generate simplified Chinese characters into our dictionary.

  • The Old New Thing

    The best book on ActiveX programming ever written


    I was introduced to the glory that is the world of Mr. Bunny many years ago. Mr. Bunny's Guide to ActiveX is probably the best book on ActiveX programming ever written.

    If you haven't figured it out by now, it's a humor book, but it's the sort of madcap insane geek humor that has enough truth in it to make you laugh more.

    My favorite is the first exercise from the first chapter: Connect the dots. (Warning: It's harder than it looks!)

  • The Old New Thing

    How can I recover the dialog resource ID from a dialog window handle?


    Occasionally, I see someone ask a question like the following.

    I have the handle to a dialog window. How can I get the original dialog resource ID that the dialog was created from?

    As we saw in our in-depth discussion of how dialogs are created from dialog templates, the dialog template itself is not saved anywhere. The purpose of a template is to act as the... well... "template" for creating a dialog box. Once the dialog box has been created, there is no need for the template any more. Consequently, there is no reason why the system should remember it.

    Besides, if the dialog were created from a runtime-generated template, saving the original parameters would leave pointers to freed memory. Furthermore, the code that created the dialog box almost certainly modified the dialog box during its WM_INITDIALOG message processing (filling list boxes with data, maybe enabling or disabling some buttons), so the dialog box you see on screen doesn't correspond to a template anywhere.

    It's like asking, "Given a plate of food, how do I recover the original cookbook and page number for the recipe?" By doing a chemical analysis of the food, you might be able to recover "a" recipe, but there is nothing in the food itself that says, "I came from The Joy of Cooking, page 253."

  • The Old New Thing

    What struck me about life in the Republic


    When people asked me for my reaction to the most recent Star Wars movie, I replied that what struck me most was that the Republic doesn't appear to have any building codes. There are these platforms several hundred meters above the ground with no railings. For example, Padmé Amidala's fancy apartment has a front porch far above the ground. Consider: You're carrying a load of packages to the car, the kids are running around, you turn around to yell at one of them, miss a step, and over the rim you go. How many people fall to their deaths in that galaxy?

  • The Old New Thing



    Among the things you can get with the GetStockObject function are two fonts called SYSTEM_FONT and DEFAULT_GUI_FONT. What are they?

    They are fonts nobody uses any more.

    Back in the old days of Windows 2.0, the font used for dialog boxes was a bitmap font called System. This is the font that SYSTEM_FONT retrieves, and it is still the default dialog box font for compatibility reasons. Of course, nobody nowadays would ever use such an ugly font for their dialog boxes. (Among other things, it's a bitmap font and therefore does not look good at high resolutions, nor can it be anti-aliased.)

    DEFAULT_GUI_FONT has an even less illustrious history. It was created during Windows 95 development in the hopes of becoming the new default GUI font, but by July 1994, Windows itself stopped using it in favor of the various fonts returned by the SystemParametersInfo function. Its existence is now vestigial.

    One major gotcha with SYSTEM_FONT and DEFAULT_GUI_FONT is that on a typical US-English machine, they map to bitmap fonts that do not support ClearType.

  • The Old New Thing

    What's the point of DeferWindowPos?


    The purpose of the DeferWindowPos function is to move multiple child windows at one go. This reduces somewhat the amount of repainting that goes on when windows move around.

    Take that DC brush sample from a few months ago and make the following changes:

    HWND g_hwndChildren[2];

    BOOL
    OnCreate(HWND hwnd, LPCREATESTRUCT lpcs)
    {
     const static COLORREF s_rgclr[2] =
        { RGB(255,0,0), RGB(0,255,0) };
     for (int i = 0; i < 2; i++) {
      g_hwndChildren[i] = CreateWindow(TEXT("static"), NULL,
            WS_VISIBLE | WS_CHILD, 0, 0, 0, 0,
            hwnd, (HMENU)IntToPtr(s_rgclr[i]), g_hinst, 0);
      if (!g_hwndChildren[i]) return FALSE;
     }
     return TRUE;
    }

    Notice that I'm using the control ID to hold the desired color. We retrieve it when choosing our background color.

    HBRUSH OnCtlColor(HWND hwnd, HDC hdc, HWND hwndChild, int type)
    {
      Sleep(500);
      SetDCBrushColor(hdc, (COLORREF)GetDlgCtrlID(hwndChild));
      return GetStockBrush(DC_BRUSH);
    }

        HANDLE_MSG(hwnd, WM_CTLCOLORSTATIC, OnCtlColor);

    I threw in a half-second sleep. This will make the painting a little easier to see.

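    void OnSize(HWND hwnd, UINT state, int cx, int cy)
    {
      int cxHalf = cx/2;
      SetWindowPos(g_hwndChildren[0], NULL, 0, 0, cxHalf, cy,
                   SWP_NOZORDER | SWP_NOOWNERZORDER | SWP_NOACTIVATE);
      SetWindowPos(g_hwndChildren[1], NULL, cxHalf, 0, cx-cxHalf, cy,
                   SWP_NOZORDER | SWP_NOOWNERZORDER | SWP_NOACTIVATE);
    }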

    We place the two child windows side by side in our client area. For our first pass, we'll use the SetWindowPos function to position the windows.

    Compile and run this program, and once it's up, click the maximize box. Observe carefully which parts of the green rectangle get repainted.

    Now let's change our positioning code to use the DeferWindowPos function. The usage pattern for the deferred window positioning functions is as follows:

    HDWP hdwp = BeginDeferWindowPos(n);
    if (hdwp) hdwp = DeferWindowPos(hdwp, ...); // 1 [fixed 7/7]
    if (hdwp) hdwp = DeferWindowPos(hdwp, ...); // 2
    if (hdwp) hdwp = DeferWindowPos(hdwp, ...); // 3
    ...
    if (hdwp) hdwp = DeferWindowPos(hdwp, ...); // n
    if (hdwp) EndDeferWindowPos(hdwp);

    There are some key points here.

    • The value you pass to the BeginDeferWindowPos function is the number of windows you intend to move. It's okay if you get this value wrong, but getting it right will reduce the number of internal reallocations.
    • The return value from DeferWindowPos is stored back into the hdwp because the return value is not necessarily the same as the value originally passed in. If the deferral bookkeeping needs to perform a reallocation, the DeferWindowPos function returns a handle to the new defer information; the old defer information is no longer valid. What's more, if the deferral fails, the old defer information is destroyed. This is different from the realloc function, which leaves the original object unchanged if the reallocation fails. The pattern p = realloc(p, ...) leaks the original block when the reallocation fails, because the only pointer to it has just been overwritten with NULL, but the pattern hdwp = DeferWindowPos(hdwp, ...) does not leak, because the old handle has already been destroyed.

    That second point is important. Many people get it wrong.
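
    For contrast, here is what the realloc side of that comparison has to look like to avoid the leak. This is a minimal sketch using only the standard library; GrowBuffer is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Grows the buffer to newSize bytes. On failure the original block is left
// untouched and still owned by the caller, so nothing is leaked.
bool GrowBuffer(char*& buffer, std::size_t newSize) {
    void* grown = std::realloc(buffer, newSize);
    if (!grown) return false;  // buffer is still valid; do not overwrite it
    buffer = static_cast<char*>(grown);
    return true;
}
```

    With realloc you need the temporary because failure leaves you holding the old block. With DeferWindowPos there is nothing left to hold on failure, which is why overwriting hdwp directly is the correct pattern there.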

    Okay, now that you're all probably scared of this function, let's change our repositioning code to take advantage of deferred window positioning. It's really not that hard at all. (Save these changes to a new file, though. We'll want to run the old and new versions side by side.)

    void OnSize(HWND hwnd, UINT state, int cx, int cy)
    {
      HDWP hdwp = BeginDeferWindowPos(2);
      int cxHalf = cx/2;
      if (hdwp) hdwp = DeferWindowPos(hdwp, g_hwndChildren[0],
                   NULL, 0, 0, cxHalf, cy,
                   SWP_NOZORDER | SWP_NOOWNERZORDER | SWP_NOACTIVATE);
      if (hdwp) hdwp = DeferWindowPos(hdwp, g_hwndChildren[1],
                   NULL, cxHalf, 0, cx-cxHalf, cy,
                   SWP_NOZORDER | SWP_NOOWNERZORDER | SWP_NOACTIVATE);
      if (hdwp) EndDeferWindowPos(hdwp);
    }

    Compile and run this program, and again, once it's up, maximize the window and observe which regions repaint. Observe that there is slightly less repainting in the new version compared to the old version.

  • The Old New Thing

    Answers to yesterday's holiday fun puzzles


    Puzzle 1: This was a word search consisting of the names of the twelve streets of central downtown Seattle. The unused letters spell out the message "Issaquah year's supply hair conditioner", which takes you to the Issaquah Costco. (In the real puzzle, the secret message was much odder but relied on an inside joke.)

    Puzzle 2: The cryptogram decodes as follows:

    "There's no such thing as a stupid question." Go to the information desk at Center House and ask if they have any tickets available for the Mariners game.

    This was itself a bit of an inside joke, because my friend worked at the information desk at Center House and had to answer stupid questions like this one. (The Mariners play at Safeco Field, not Seattle Center.)

    The clues at the end told you how to map each letter. For example, the first clue "adieu" says that the letter "A" in the cryptogram maps to "U" in the cleartext. The cryptogram was easy enough that my friends didn't need the bonus help, but in case you did, here are the answers: adieu, brook, coolj, dweeb, essay, fungi, genre, humor, incus, johnq, kazoo, lyric, mymtv, novel, overt, poach, quaff, rolex, soyuz, turow, usurp, venom, wrong, xenon, yield, zebra.

    Puzzle 3: This is a double-acrostic puzzle. (Instructions on how to solve a double-acrostic.)

    Give every book fifty pages before you commit to it or give it up. If you're over fifty, take your age and subtract it from one hundred—the result is the number of pages you should read before deciding. Time is too short to read something you don't like.

    The answers to the clues are as follows:

    1. yo-yo diet
    2. overdue
    3. University of Illinois
    4. refought
    5. fondue pot (my friend likes fondue)
    6. indecisive
    7. red-eye
    8. soggy (my friend hates soggy corn flakes)
    9. The Herb Farm
    10. stork (original clue referenced friends who are expecting their first child)
    11. eco-tourism
    12. abbey
    13. tiff
    14. thud
    15. Love-Sac (my friend has one in her living room)
    16. eighth (original clue used Seattle library trivia)
    17. audio-book
    18. pink
    19. Asteroid (original clue referenced a meal we had there)
    20. roommate (original clue referenced her current one)
    21. tagged (she plays softball)
    22. Metro bus route fourteen (with which she's very familiar)
    23. effigy
    24. notary (my friend is a notary too)
    25. trumpet

    The secret message is "Your first Seattle apartment", where she was greeted by her first roommate!

    As you can see, a lot of the clues used inside information. There are also several library-related clues since my friend volunteers for the Seattle Public Library, and the quotation itself was a bit of a gimme because my friend is a huge Nancy Pearl fan.

    Puzzle 4: A straightforward Jumble with a Seattle Center theme. Key Arena, Center House, Space Needle, Monorail and Fun Forest are the anagrams, leading to the destination Earth & Ocean, my friend's favorite dessert restaurant. At Earth and Ocean, she was treated to lunch including one of every dessert on the menu and a special visit from the dessert chef herself.

    Puzzle 5: The solution to the riddle is "Sim + foe + knee = symphony", which led her to Benaroya Hall.

    Puzzle 6: The first series consists of baseball-related terms: shortstop, pinch hitter, home run, umpire, triple, strikeout, sacrifice, infield, line drive, and center field. The second series consists of names of teams in the NBA: Super Sonics, Pistons, Hornets, Rockets, Celtics, Clippers, Cavaliers, Mavericks, Timber Wolves, and Trail Blazers. The final words are "CENTRAL LIBRARY", which of course takes her to Seattle Central Library, where the big party awaited her.

    This was a very busy day for me, constantly tracking my friend's progress through the puzzles, calling all her friends to make sure they were in position, then calling them again after she left to tell them where the party was going to be. (Didn't want to risk them letting slip the final location with a casual remark like, "See you at the library!") Amazingly, she stayed pretty close to the schedule I had sketched out, except at the very end where we needed to stall her for about half an hour so she wouldn't show up before her party guests!

  • The Old New Thing

    Using script to query information from Internet Explorer windows


    Some time ago, we used C++ to query information from the ShellWindows object and found it straightforward but cumbersome.

    This is rather clumsy from C++ because the ShellWindows object was designed for use by a scripting language like JScript or Visual Basic.

    Let's use one of the languages the ShellWindows object was designed for to enumerate all the open shell windows. Run it with the command line cscript sample.js.

    var shellWindows = new ActiveXObject("Shell.Application").Windows();
    for (var i = 0; i < shellWindows.Count; i++) {
      var w = shellWindows.Item(i);
      WScript.StdOut.WriteLine(w.LocationName + "=" + w.LocationURL);
    }

    Well that was quite a bit shorter, wasn't it!
