May, 2011

  • The Old New Thing

    Why is hybrid sleep off by default on laptops? (and how do I turn it on?)

    • 27 Comments

    Hybrid sleep is a type of sleep state that combines sleep and hibernate. When you put the computer into a hybrid sleep state, it writes out all its RAM to the hard drive (just like a hibernate), and then goes into a low power state that keeps RAM refreshed (just like a sleep). The idea is that you can resume the computer quickly from sleep, but if there is a power failure or some other catastrophe, you can still restore the computer from hibernation.

    A hybrid sleep can be converted to a hibernation by simply turning off the power. By comparison, a normal sleep requires resuming the computer to full power in order to write out the hibernation file. Back in the Windows XP days, I would sometimes see the computer in the next room spontaneously turn itself on: I'm startled at first, but then I see on the screen that the system is hibernating, and I understand what's going on.

    Hybrid sleep is on by default for desktop systems but off by default on laptops. Why this choice?

    First of all, desktops are at higher risk of the power outage scenario wherein a loss of power (either due to a genuine power outage or simply unplugging the computer by mistake) causes all work in progress to be lost. Desktop computers typically don't have a backup battery, so a loss of power means instant loss of sleep state. By comparison, laptop computers have a battery which can bridge across power outages.

    Furthermore, laptops have a safety against battery drain: When battery power gets dangerously low, the laptop can perform an emergency hibernate.

    Laptop manufacturers also requested that hybrid sleep be off by default. They didn't want the hard drive to be active for a long time while the system is suspending, because when users suspend a laptop, it's often in the form of "Close the lid, pick up the laptop from the desk, throw it into a bag, head out." Performing large quantities of disk I/O at a moment when the computer is physically being jostled around increases the risk that one of those I/O's will go bad. This pattern doesn't exist for desktops: When you suspend a desktop computer, you just leave it there and let it do its thing.

    Of course, you can override this default easily from the Control Panel. Under Power Options, select Change plan settings, then Change advanced power settings, and wander over into the Sleep section of the configuration tree.

    If you're a command line sort of person, you can use this insanely geeky command line to enable hybrid sleep when running on AC power in Balanced mode:

    powercfg -setacvalueindex 381b4222-f694-41f0-9685-ff5bb260df2e
                              238c9fa8-0aad-41ed-83f4-97be242c8f20
                              94ac6d29-73ce-41a6-809f-6363ba21b47e 1
    

    (All one line. Take a deep breath.) [Update: Or you can use powercfg -setacvalueindex SCHEME_BALANCED SUB_SLEEP HYBRIDSLEEP 1, as pointed out by Random832. I missed this because the ability to substitute aliases is not mentioned in the -setacvalueindex documentation. You have to dig into the -aliases documentation to find it.]

    Okay, what do all these insane options mean?

    -setacvalueindex sets the behavior when running on AC power. To change the behavior when running on battery, use -setdcvalueindex instead. Okay, that was easy.

    The next part is a GUID, specifically, the GUID that represents the balanced power scheme. If you want to modify the setting for a different power scheme, then substitute that scheme's GUID.

    After the scheme GUID comes the subgroup GUID. Here, we give the GUID for the Sleep subgroup.

    Next we have the GUID for the Hybrid Sleep setting.

    Finally, we have the desired new value for the setting. As you might expect, 1 enables it and 0 disables it.

    And where did these magic GUIDs come from? Run the powercfg -aliases command to see all the GUIDs. You can also run powercfg -q to view all the settings and their current values in the current power scheme.
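    To make the anatomy of that command line concrete, here is a minimal sketch that glues the five parts together in the order described above. The function name and its parameters are made up for illustration; the verbs and the aliases are the ones quoted in this article, and the sketch only builds the string without running anything.

```cpp
#include <cassert>
#include <string>

// Assembles a powercfg command line from the parts walked through above.
// BuildPowercfgCommand and its parameters are hypothetical names; the
// verbs and aliases are the ones quoted in the article.
std::string BuildPowercfgCommand(bool onBattery,
                                 const std::string& scheme,
                                 const std::string& subgroup,
                                 const std::string& setting,
                                 int value)
{
    assert(!scheme.empty() && !subgroup.empty() && !setting.empty());

    // AC power uses -setacvalueindex; battery uses -setdcvalueindex.
    const std::string verb = onBattery ? "-setdcvalueindex" : "-setacvalueindex";

    // Order matters: scheme, then subgroup, then setting, then the new value.
    return "powercfg " + verb + " " + scheme + " " + subgroup + " " +
           setting + " " + std::to_string(value);
}
```

    Calling BuildPowercfgCommand(true, "SCHEME_BALANCED", "SUB_SLEEP", "HYBRIDSLEEP", 1) produces the battery-power counterpart of the command shown earlier. Either the aliases or the raw GUIDs can be passed for the middle three parts.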

  • The Old New Thing

    Looking at the world through kernel-colored glasses

    • 14 Comments

    During a discussion of the proper way of cancelling I/O, the question was raised as to whether it was safe to free the I/O buffer, close the event handle, and free the OVERLAPPED structure immediately after the call to CancelIo. The response from the kernel developer was telling.

    That's fine. We write back to the buffer under a try/except, so if the memory is freed, we'll just ignore it. And we take a reference to the handle, so closing it does no harm.

    These may be the right answers from a kernel-mode point of view (where the focus is on ensuring that consistency in kernel mode is not compromised), but they are horrible answers from an application point of view: Kernel mode will write back to the buffer and the OVERLAPPED when the I/O completes, thereby corrupting user-mode memory if user-mode had re-used the memory for some other purpose. And if the handle in the OVERLAPPED structure is closed, then user mode has lost its only way of determining when it's safe to continue! You had to look beyond the literal answer to see what the consequences were for application correctness.

    (You can also spot the kernel-mode point of view in the clause "if the memory is freed." The developer is talking about freed from kernel mode's point of view, meaning that it has been freed back to the operating system and is no longer committed in the process address space. But memory that is logically freed from the application's point of view may not be freed back to the kernel. It's usually just freed back into the heap's free pool.)

    The correct answer is that you have to wait for the I/O to complete before you free the buffer, close the event handle, or free the OVERLAPPED structure.
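    The ordering can be sketched without any Win32 at all. What follows is a single-threaded analogy, not real overlapped I/O: PendingRead, FakeIoSubsystem, and CancelWaitThenFree are hypothetical stand-ins, and the assumption baked into the fake is the one described above, namely that completion writes to the caller's buffer even when the operation was cancelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <deque>
#include <memory>
#include <string>

// Hypothetical stand-in for a pending asynchronous read (not a Win32 type).
struct PendingRead {
    char* buffer;          // caller-owned; written when the operation completes
    std::size_t size;
    bool cancelRequested;
    bool completed;
};

// Hypothetical stand-in for the I/O subsystem.
class FakeIoSubsystem {
public:
    void Start(PendingRead* op) { pending_.push_back(op); }

    // In the spirit of CancelIo: records the request and returns at once.
    void RequestCancel(PendingRead* op) { op->cancelRequested = true; }

    // Drains the queue until the given operation has completed.
    void WaitFor(PendingRead* op) {
        while (!op->completed && !pending_.empty()) CompleteNext();
    }

private:
    void CompleteNext() {
        PendingRead* op = pending_.front();
        pending_.pop_front();
        // Completion writes to the caller's buffer even for a cancelled read.
        std::snprintf(op->buffer, op->size, "%s",
                      op->cancelRequested ? "cancelled" : "data");
        op->completed = true;
    }

    std::deque<PendingRead*> pending_;
};

std::string CancelWaitThenFree() {
    FakeIoSubsystem io;
    std::unique_ptr<char[]> buffer(new char[16]());
    PendingRead op{buffer.get(), 16, false, false};

    io.Start(&op);
    io.RequestCancel(&op);
    assert(!op.completed);  // still in flight: freeing the buffer here is the bug

    io.WaitFor(&op);        // only now is it safe to release anything
    assert(op.completed);
    return std::string(buffer.get());  // buffer is released after completion
}
```

    In this sketch CancelWaitThenFree() returns the status that the fake subsystem wrote into the buffer. Had the buffer been released right after RequestCancel, that write would have landed in memory the application no longer owned.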

    Don't fall into this trap. The kernel developer was looking at the world through kernel-colored glasses. But you need to look at the situation from the perspective of your customers. When the kernel developer wrote "That's fine", he meant "That's fine for me." Sucks to be you, though.

    It's like programming an autopilot to land an airplane, but sending it through aerobatics that kill all the passengers. If you ask the autopilot team, they would say that they accomplished their mission: Technically, the autopilot did land the airplane.

    Here's another example of kernel-colored glasses. And another.

    Epilogue: To be fair, after I pointed out the kernel-mode bias in the response, the kernel developer admitted, "You're right, sorry. I was too focused on the kernel-mode perspective and wasn't looking at the bigger picture."

  • The Old New Thing

    Why is my program terminating with exit code 3?

    • 20 Comments

    There is no standard for process exit codes. You can pass anything you want to ExitProcess, and that's what GetExitCodeProcess will give back. The kernel does no interpretation of the value. If you want code 42 to mean "Something infinitely improbable has occurred" then more power to you.

    There is a convention, however, that an exit code of zero means success (though what constitutes "success" is left to the discretion of the author of the program) and a nonzero exit code means failure (again, with details left to the discretion of the programmer). Often, higher values for the exit code indicate more severe types of failure. The command processor ERRORLEVEL keyword was designed with these conventions in mind.
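    Here is a small sketch of how those conventions fit together. The exit codes belong to an imaginary tool and the function names are hypothetical. The one outside fact it relies on is that the batch test "if errorlevel N" succeeds when the exit code is N or higher, which is why ordering codes by severity works out nicely.

```cpp
#include <cassert>

// Hypothetical exit codes for an imaginary tool, ordered by severity.
enum ToolExitCode {
    ToolSucceeded = 0,
    ToolHadWarnings = 1,
    ToolFailed = 2
};

// The convention: zero means success, anything else means failure.
bool Succeeded(int exitCode) { return exitCode == 0; }

// Models the batch test "if errorlevel N", which is true when the
// exit code is greater than or equal to N.
bool IfErrorLevel(int exitCode, int threshold) { return exitCode >= threshold; }
```

    With severity-ordered codes, IfErrorLevel(code, 1) catches warnings and failures alike, while IfErrorLevel(code, 2) catches only outright failures.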

    There are cases where your process will get in such a bad state that a component will take it upon itself to terminate the process. For example, if a process cannot locate the DLLs it imports from, or one of those DLLs fails to initialize, the loader will terminate the process and use the status code as the process exit code. I believe that when a program crashes due to an unhandled exception, the exception code is used as the exit code.

    A customer was seeing their program crash with an exit code of 3 and couldn't figure out where it was coming from. They never use that exit code in their program. Eventually, the source of the magic number 3 was identified: The C runtime abort function terminates the process with exit code 3.

  • The Old New Thing

    Just for fun: Sample user names in Windows 7

    • 45 Comments

    There are a few places in Windows where you are asked to enter your name in order to set up an account. Just for fun, I went through all the localized versions I could find and extracted the sample names. Some locales did not get around to translating all the strings. If the string was left untranslated (which can happen for LIPs), then I left the box blank. (Locales which did not translate either string have been omitted from the table.)

    My reactions after the table.

    ID English name for example, John For example: John Smith
    af-za Afrikaans byvoorbeeld, John Voorbeeld: Jan Smit
    am-et Amharic ለምሳሌ: አበበ ከበደ
    ar-sa Arabic على سبيل المثال، أمجد على سبيل المثال: أشرف ماهر
    as-in Assamese উদাহৰণৰ কাৰণে, প্ৰবিন উদাহৰণ স্বৰূপে: জন স্মিথ
    az-latn-az Azerbaijani məsələn, Fərhad Məsələn: Vüsal Tahirov
    bg-bg Bulgarian например Kiril Например: Пламен Христов
    bn-bd Bengali (Bangladesh) উদাহরণস্বরূপ, জন উদাহরণস্বরূপ: জন স্মিথ
    bn-in Bengali (India) উদাহরণস্বরূপ, রাম উদাহরণস্বরূপ: জন স্মিথ
    bs-cyrl-ba Bosnian (Cyrillic) напримјер John Напримјер: Алмир Алмировић
    bs-latn-ba Bosnian (Latin) naprimjer John Naprimjer: Almir Almirović
    ca-es Catalan per exemple, Jordi Per exemple: Pau Solà
    cs-cz Czech například Tereza Příklad: Jan Novák
    cy-gb Welsh Siôn, er enghraifft Er enghraifft: Siân Jones
    da-dk Danish f.eks. Claus Eksempel: Jens Jensen
    de-de German N/A Beispiel: Jens Mander
    el-gr Greek για παράδειγμα, Γιάννης Για παράδειγμα: Γεώργιος Βασιλείου
    en-us English for example, John For example: John Smith
    es-es Spanish por ejemplo, Juan Por ejemplo: Jorge López
    et-ee Estonian näiteks Jaan Näiteks: Mati Kask
    eu-es Basque esaterako, Rafa Adibidez: Ane Lizarralde
    fa-ir Persian برای مثال، John
    fi-fi Finnish esimerkiksi Juha Esimerkki: Henri Rautiainen
    fil-ph Filipino halimbawa, Juan Halimbawa: Juan dela Cruz
    fr-fr French par exemple Rosalie Par exemple : Marie Dubois
    ga-ie Irish mar shampla, Seán Mar shampla: Seán Ó Murchú
    gl-es Galician por exemplo: Xiana Por exemplo: Duarte Vidal
    gu-in Gujarati ઉદાહરણ તરીકે, કમલેશ ઉદાહરણ તરીકે: કમલેશ દવે
    ha-latn-ng Hausa A misali:John Smith
    he-il Hebrew לדוגמה, John לדוגמה: משה כהן
    hi-in Hindi उदाहरण के लिए, अमित उदाहरण के लिए: जनमेजय सिंह सिकरवार
    hr-hr Croatian na primjer, Zdenko Primjerice: Ivan Kovač
    hu-hu Hungarian például Lilian Például: Tót Béla
    hy-am Armenian օրինակ՝ Արամ Օրինակ. Արմեն Արմենյան
    id-id Indonesian misalnya, John Misalnya: John Smith
    ig-ng Igbo iji maatụ, Chukwubike-ụgbaja Ọmụmatụ̀: Ụgbaja Chukwubuike Greg
    is-is Icelandic til dæmis Jón Dæmi: Jón Jónsson
    it-it Italian ad esempio, Luca Ad esempio: Valeria Dal Monte
    iu-latn-ca Inuktitut Suurlu: John Smith
    ja-jp Japanese 例: John 例: Taro Chofu
    ka-ge Georgian მაგ. Nino მაგალითად: დიმიტრი გოგელია
    kk-kz Kazakh мысалы, Джон Мысалы: Аманбайқызы Айнұр
    km-kh Khmer ឧទាហរណ៍ John ឧទាហរណ៍: John Smith
    kn-in Kannada ಉದಾಹರಣೆಗೆ, ಜಾನ್ ಉದಾಹರಣೆಗೆ: ಜಾನ್ ಸ್ಮಿತ್
    ko-kr Korean 예: John 예: 홍길동
    kok-in Konkani देखीक, जॉन देखीक: जॉन स्मिथ
    ky-kg Kirghiz Мисалы: Тилек Чубаков
    lb-lu Luxembourgish z. Bsp., John Zum Beispill: Marc Majerus
    lt-lt Lithuanian pvz., Jonas Pvz., Jonas Jonaitis
    lv-lv Latvian piemēram, Jānis Piemēram: Jānis Zariņš
    mi-nz Māori Hei tauira: Hone Mete
    mk-mk Macedonian на пример, Зоран На пример: Бранко Стојановски
    ml-in Malayalam ഉദാ: ജോണ് ഉദാഹരണമായി: John Smith
    mr-in Marathi उदाहरणासाठी, जॉन उदाहरणार्थ: जॉन स्मिथ
    ms-bn Malay (Brunei) sebagai contoh, John Sebagai contoh: John Smith
    ms-my Malay (Malaysia) sebagai contoh, John Sebagai contoh: John Smith
    mt-mt Maltese pereżempju, John Eżempju: John Pace
    nb-no Norwegian (Bokmål) for eksempel Kim Eksempel: Jens Jensen
    ne-np Nepali उदाहरणार्थ, जोन उदाहरणको निमित्त: राम बहादुर
    nl-nl Dutch bijvoorbeeld: Emma Bijvoorbeeld: Jan Smit
    nn-no Norwegian (Nynorsk) for eksempel Kim Eksempel: Jens Jensen
    nso-za Northern Sotho go fa mohlala, John Mohlala: John Smith
    or-in Oriya ଉଦାହରଣ ସ୍ଵରୂପ, ଜୋନ୍ ଉଦାହରଣ ସ୍ଵରୂପ: ଦିପ୍ତି ରଞ୍ଜନ
    pa-in Panjabi ਉਦਾਹਰਣ ਦੇ ਲਈ, ਜੌਨ ਉਦਾਹਰਣ ਦੇ ਲਈ: John Smith
    pl-pl Polish na przykład: Tomek Na przykład: Jan Kowalski
    pt-br Portuguese (Brazil) por exemplo, Marcio Por exemplo: João Silva
    pt-pt Portuguese (Portugal) por exemplo: Rui Por exemplo: Jorge Santos
    qps-ploc Pseudo ƒŏг єжåмþľę, Јσĥň [sqymS][₣óя εхдmþĺĕ: Јǿћη Šмīťђ !!! !!]
    qps-mirr Pseudo (Mirrored) [For example: John Smith]
    quz-pe Quechua qatina, Juan Kay hina: Jorge Lopez
    ro-ro Romanian Ion, de exemplu De exemplu: Ion Popescu
    ru-ru Russian например, Андрей Например: Иван Петров
    si-lk Sinhala නිදසුන් ලෙස, නිමල් නිදසුනක් ලෙස: Don Lasith
    sk-sk Slovak napríklad Ján Príklad: Peter Kováč
    sl-si Slovene na primer Janez Na primer: Janez Kranjski
    sq-al Albanian për shembull, Vehbi Për shembull: Vehbi Neziri
    sr-cyrl-cs Serbian (Cyrillic) на пример, Јован На пример: Петар Петровић
    sr-latn-cs Serbian (Latin) na primer, Jovan Na primer: Petar Petrović
    sv-se Swedish t.ex. Rebecca Exempel: John Smith
    sw-ke Swahili kwa mfano, Yohana Kwa mfano: Mussa Joseph
    ta-in Tamil எடுத்துக்காட்டாக, ஜான் உதாரணத்திற்கு: குமார்
    te-in Telugu ఉదాహరణకు, వేణు ఉదాహరణకు: రామ్ లక్ష్మణ్
    th-th Thai ตัวอย่างเช่น John ตัวอย่างเช่น: John Smith
    tn-za Tswana sekai, Tidimalo Sekao jaaka: Pule Molefe
    tr-tr Turkish örneğin, Can Örneğin: Kemal Etikan
    tt-ru Tatar мәсәлән, Фәрит Мәсәлән: Гали Вәлиев
    uk-ua Ukrainian наприклад, Тарас Наприклад: Тарас Руденко
    ur-pk Urdu مثال کے طور پر, امجد مثال کے طور پر: صفدر رشيد
    uz-latn-uz Uzbek masalan, Akmal Masalan: Adham Soliyev
    vi-vn Vietnamese ví dụ, John Ví dụ: John Smith
    yo-ng Yoruba bí àpẹẹrẹ, Jòhánù Bí àpẹẹrẹ: John Smith
    zh-cn Chinese (PRC) 例如: John 例如: 李建国
    zh-hk Chinese (Hong Kong) 例如,John
    zh-tw Chinese (Taiwan) 例如,John 範例: 祝英台
    zu-za Zulu isibonelo, John Isibonelo: John Smith

    Some observations:

    • Many languages translated the words "for example" but left the name as John or John Smith. I'm looking at you, Sweden. "John Smith"? Really? You couldn't have changed it to Sven Svensson?
    • Some languages chose generic names (like Jan Novák), keeping to the spirit of the English sample name. Others chose to substitute a real name (like Marie Dubois). [Update: See correction from Hardt.]
    • German doesn't provide a sample first name. My guess is that they ran out of room! The string Geben Sie einen Benutzernamen ein: probably took up all the space in the dialog, leaving no room for an example. [Update: See explanation of name Jens Mander from Roland.]
    • This information (and plenty of other translation goodness) is publicly available for non-commercial use.

    Related: The Locales of Windows 7, all divvied up.

  • The Old New Thing

    My evil essence revealed

    • 21 Comments

    I found it amusing that somebody considered the fact that Microsoft employees can read my queued-up blog entries before the articles are published to be further evidence of Microsoft's evil essence as a monopoly.

    Just for the record, this is not evidence of Microsoft's evil essence as a monopoly. Rather, it's evidence of Raymond's evil essence as a monopoly, because the monopoly on blog articles written by Raymond Chen that haven't yet been published belongs to me.

  • The Old New Thing

    Why does Explorer show a thumbnail for my image that's different from the image?

    • 21 Comments

    A customer (via a customer liaison) reported that Explorer sometimes showed a thumbnail for an image file that didn't exactly match the image itself.

    I have an image that consists of a collage of other images. When I switch Explorer to Extra Large Icons mode, the thumbnail is a miniature representation of the image file. But in Large Icons and Medium Icons mode, the thumbnail image shows only one of the images in the collage. I've tried deleting the thumbnail cache, but that didn't help; Explorer still shows the wrong thumbnails for the smaller icon modes. What is wrong?

    The customer provided screenshots demonstrating the problem, but the customer did not provide the image files themselves that were exhibiting the problem. I therefore was reduced to using my psychic powers.

    My psychic powers tell me that your JPG file has the single-item image as the camera-provided thumbnail. The shell will use the camera-provided thumbnail if suitable.

    The customer liaison replied,

    The customer tells me that the problem began happening after they edited the images. Attached is one of the images that's demonstrating the problem.

    Some image types (most notably TIFF and JPEG) support the EXIF format for encoding image metadata. This metadata includes information such as the model of camera used to take the picture, the date the picture was taken, and various camera settings related to the photograph. But the one that's interesting today is the image thumbnail.

    When Explorer wants to display a thumbnail for an image, it first checks whether the image comes with a precalculated thumbnail. If so, and the thumbnail is at least as large as the thumbnail Explorer wants to show, then Explorer will use the image-provided thumbnail instead of creating its own from scratch. If the thumbnail embedded in the image is wrong, then when Explorer displays the image-provided thumbnail, the result will be incorrect. Explorer has no idea that the image is lying to it.

    Note that the decision whether to use the image-provided thumbnail is not based solely on the view. (In other words, the conclusion is not "Explorer uses the image-provided thumbnail for Large Icons and Medium Icons but ignores it for Extra Large Icons.") The decision is based on both the view and the size of the image-provided thumbnail. If the image-provided thumbnail is at least the size of the view, then Explorer will use it. For example, if your view is set to 64 × 64 thumbnails, then the image-provided thumbnail will be used if it is at least 64 × 64.
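    The rule can be written down as a one-line predicate. This is a sketch of the decision as described here, not Explorer's actual code: the names are hypothetical, and comparing both the width and the height against the view size is my assumption about what "at least the size of the view" means for a non-square thumbnail.

```cpp
#include <cassert>

// Hypothetical description of the thumbnail embedded in an image file.
struct EmbeddedThumbnail {
    bool present;
    int width;
    int height;
};

// Sketch of the decision described above. Assumption: the embedded
// thumbnail must cover the requested view size in both dimensions.
bool ShouldUseEmbeddedThumbnail(const EmbeddedThumbnail& thumb, int viewSize)
{
    return thumb.present &&
           thumb.width >= viewSize &&
           thumb.height >= viewSize;
}
```

    With an illustrative 160 × 120 embedded thumbnail, a 64 × 64 view uses the embedded (possibly stale) thumbnail, whereas a 256 × 256 view does not, so Explorer generates a fresh one from the real image. That matches the symptom the customer saw: correct at the largest size, wrong at the smaller ones.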

    The Wikipedia page on EXIF points out that "Photo manipulation software sometimes fails to update the embedded information after an editing operation." It appears that some major image editing software packages fail to update the EXIF thumbnail when an image is edited, which can result in inadvertent information disclosure: If the image was cropped or otherwise altered to remove information, the information may still linger in the thumbnail. This Web site has a small gallery of examples.

  • The Old New Thing

    Why don't the file timestamps on an extracted file match the ones stored in the ZIP file?

    • 35 Comments

    A customer liaison had the following question:

    My customer has ZIP files stored on a remote server being accessed from a machine running Windows Server 2003 and Internet Explorer Enhanced Security Configuration. When we extract files from the ZIP file, the last-modified time is set to the current time rather than the time specified in the ZIP file. The problem goes away if we disable Enhanced Security Configuration or if we add the remote server to our Trusted Sites list. We think the reason is that if the file is in a non-trusted zone, the ZIP file is copied to a temporary location and is extracted from there, and somehow the date information is lost.

    The customer is reluctant to turn off Enhanced Security Configuration (which is understandable) and doesn't want to add the server as a trusted site (somewhat less understandable). Their questions are

    • Why is the time stamp changed during the extract? If we copy the ZIP file locally and extract from there, the time stamp is preserved.
    • Why does being in an untrusted zone affect the behavior?
    • How can we avoid this behavior without having to disable Enhanced Security Configuration or adding the server as a trusted site?

    The customer has an interesting theory (that the ZIP file is copied locally) but it's the wrong theory. After all, copying the ZIP file locally doesn't modify the timestamps stored inside it.

    Since the ZIP file is on an untrusted source, a zone identifier is being applied to the extracted file to indicate that the resulting file is not trustworthy. This permits Explorer to display a dialog box that says "Do you want to run this file? It was downloaded from the Internet, and bad guys hang out there, bad guys who try to give you candy."

    And that's why the last-modified time is the current date: Applying the zone identifier to the extracted file modifies its last-modified time, since the file on disk is not identical to the one in the ZIP file. (The one on disk has the "Oh no, this file came from a stranger with candy!" label on it.)

    The recommended solution is to add the server containing trusted ZIP files to your trusted sites list. Since the customer is reluctant to do this (for unspecified reasons), there are some other alternatives, though they are considerably riskier. (These alternatives are spelled out in KB article 883260: Description of how the Attachment Manager works.)

    You can disable the saving of zone information from the Group Policy Editor, under Administrative Templates, Windows Components, Attachment Manager, Do not preserve zone information in file attachments. This does mean that users will not be warned when they attempt to use a file downloaded from an untrusted source, so you have to trust your users not to execute that random executable they downloaded from some untrusted corner of the Internet.

    You can use the Inclusion list for low, moderate, and high risk file types policy to add ZIP as a low-risk file type. This is not quite as drastic as suppressing zone information for all files, but it means that users who fall for the "Please unpack the attached ZIP file and open the XYZ icon" trick will not receive a "Do you want to eat this candy that a stranger gave to you?" warning prompt before they get pwned.

    But like I said, it's probably best to add just the server containing the trusted ZIP files to your trusted sites list. If the server contains both trusted and untrusted data (maybe that's why the customer doesn't want to put it on the trusted sites list), then you need to separate the trusted data from the untrusted data and put only the trusted server's name in your trusted sites list.

  • The Old New Thing

    A function pointer cast is a bug waiting to happen

    • 35 Comments

    A customer reported an application compatibility bug in Windows.

    We have some code that manages a Win32 button control. During button creation, we subclass the window by calling SetWindowSubclass. On the previous version of Windows, the subclass procedure receives the following messages, in order:

    • WM_WINDOWPOSCHANGING
    • WM_NCCALCSIZE
    • WM_WINDOWPOSCHANGED

    We do not handle any of these messages and pass them through to DefSubclassProc. On the latest version of Windows, we get only the first two messages, and comctl32 crashes while it's handling the third message before it gets a chance to call us. It looks like it's reading from invalid memory.

    The callback function goes like this:

    LRESULT ButtonSubclassProc(
        HWND hwnd,
        UINT uMsg,
        WPARAM wParam,
        LPARAM lParam,
        UINT_PTR idSubclass,
        DWORD_PTR dwRefData);
    

    We install the subclass function like this:

    SetWindowSubclass(
        hwndButton,
        reinterpret_cast<SUBCLASSPROC>(ButtonSubclassProc),
        id,
        reinterpret_cast<DWORD_PTR>(pInfo));
    

    We found that if we changed the callback function declaration to

    LRESULT CALLBACK ButtonSubclassProc(
        HWND hwnd,
        UINT uMsg,
        WPARAM wParam,
        LPARAM lParam,
        UINT_PTR idSubclass,
        DWORD_PTR dwRefData);
    

    and install the subclass function like this:

    SetWindowSubclass(
        hwndButton,
        ButtonSubclassProc,
        id,
        reinterpret_cast<DWORD_PTR>(pInfo));
    

    then the problem goes away. It looks like the new version of Windows introduced a compatibility bug; the old code works fine on all previous versions of Windows.

    Actually, you had the problem on earlier versions of Windows, too. You were just lucky that the bug wasn't a crashing bug. But now it is.

    This is a classic case of mismatching the calling convention. The SUBCLASSPROC function is declared as requiring the CALLBACK calling convention (which on x86 maps to __stdcall), but the code declared it without any calling convention at all, and the ambient calling convention was __cdecl. When they went to compile the code, they got a compiler error that said something like this:

    error C2664: 'SetWindowSubclass' : cannot convert parameter 2 from 'LRESULT (__cdecl *)(HWND,UINT,WPARAM,LPARAM,UINT_PTR,DWORD_PTR)' to 'SUBCLASSPROC'

    "Since the compiler was unable to convert the parameter, let's give it some help and stick a cast in front. There, that shut up the compiler. Those compiler guys are so stupid. They can't even figure out how to convert one function pointer to another. I bet they need help wiping their butts when they go to the bathroom."

    And there you go, you inserted a cast to shut up the compiler and masked a bug instead of fixing it.

    The only thing you can do with a function pointer after casting it is to cast it back to its original type.¹ If you try to use it as the cast type, you will crash. Maybe not today, maybe not tomorrow, but someday.

    In this case, the calling convention mismatch resulted in the stack being mismatched when the function returns. It looks like earlier versions of Windows managed to hobble along long enough before things got resynchronized (by an EBP frame restoration, most likely) so the damage didn't spread very far. But the new version of Windows, possibly one compiled with more aggressive optimizations, ran into trouble before things resynchronized, and thus occurred the crash.

    The compiler was yelling at you for a reason.
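    One way to keep the compiler on your side is to state the expectation outright instead of casting. The sketch below uses stand-in names throughout (SKETCH_CALLBACK, SketchProc, CountingProc, and Dispatch are all hypothetical, not the real comctl32 declarations). On 32-bit x86 with a Microsoft-style compiler the macro expands to __stdcall, and as far as I know the calling convention is then part of the function type, so the static_assert would catch a mismatch. Elsewhere the macro expands to nothing and only the signature is checked.

```cpp
#include <cassert>
#include <type_traits>

// Stand-in for CALLBACK: __stdcall exists only on 32-bit x86
// Microsoft-style compilers, so expand to nothing everywhere else.
#if defined(_MSC_VER) && defined(_M_IX86)
#define SKETCH_CALLBACK __stdcall
#else
#define SKETCH_CALLBACK
#endif

// Hypothetical callback type, playing the role of SUBCLASSPROC.
using SketchProc = long (SKETCH_CALLBACK*)(void* window, unsigned message, void* refData);

// Declared with the same calling convention as the callback type.
long SKETCH_CALLBACK CountingProc(void* /*window*/, unsigned message, void* refData)
{
    ++*static_cast<int*>(refData);  // count how many messages we saw
    return static_cast<long>(message);
}

// No cast anywhere: if the declaration drifts, this refuses to compile.
static_assert(std::is_same<decltype(&CountingProc), SketchProc>::value,
              "CountingProc does not match the callback type");

// Plays the role of the code that eventually calls the callback.
long Dispatch(SketchProc proc, unsigned message, void* refData)
{
    return proc(nullptr, message, refData);
}
```

    Passing CountingProc to Dispatch needs no reinterpret_cast, and that is the whole point: if you find yourself reaching for one, the declaration is what needs fixing.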

    It so happened that the Windows application compatibility team had already encountered this problem in their test labs, and a shim had already been developed to auto-correct this mistake. (Actually, the shim also corrects another mistake they hadn't noticed yet: They forgot to call RemoveWindowSubclass when they were done.)

    ¹I refer here to pointers to static functions. Pointers to member functions are entirely different animals.

  • The Old New Thing

    If it's possible to do something, then it's possible to do something WRONG

    • 22 Comments

    Once you make it possible to do something, you have to accept that you also made it possible to do something wrong.

    When the window manager was originally designed, it made it possible for programs to override many standard behaviors. They could handle the WM_NCHITTEST message so a window can be dragged by grabbing any part of the window, not just the caption bar. They could handle the WM_NCPAINT message to draw custom title bars. The theory was that making all of these things possible permitted smart people to do clever things.

    The downside is that it also permits stupid people to do dumb things.

    Changing the window procedure model from "call DefWindowProc to get default behavior" to "return whether you handled the message" wouldn't have helped. First of all, the handled/not-handled model is too restrictive: It requires you to do everything (handled) or nothing (not handled). There is no option to do a little bit. (Imagine if C++ didn't let you call the base class implementation of an overridden method.)
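    To make the C++ analogy concrete, here is a sketch with a made-up class hierarchy. Window, DragAnywhereWindow, and the hit-test values are hypothetical, and the geometry in the base class is arbitrary. The override does "a little bit": it asks the base class first and adjusts only the one answer it cares about.

```cpp
#include <cassert>

// Hypothetical hit-test results, loosely modeled on the HT* codes.
enum HitTestResult { HitBorder, HitCaption, HitClient };

// Hypothetical base class providing the default behavior.
class Window {
public:
    virtual ~Window() = default;

    virtual HitTestResult HitTest(int x, int y) const {
        if (x < 2) return HitBorder;    // arbitrary geometry for the sketch
        if (y < 20) return HitCaption;
        return HitClient;
    }
};

// Lets the user drag the window by its client area.
class DragAnywhereWindow : public Window {
public:
    HitTestResult HitTest(int x, int y) const override {
        // Call the base class first to get the default answer...
        HitTestResult result = Window::HitTest(x, y);
        // ...then tweak only the case we care about.
        if (result == HitClient) result = HitCaption;
        return result;
    }
};
```

    A DragAnywhereWindow reports the client area as caption but still reports the border as border, because everything it doesn't care about flows through from the base class untouched.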

    Doing a little bit is a very common pattern. The WM_NCHITTEST technique mentioned above, for example, uses the default hit-testing implementation, and then tweaks the result slightly:

    case WM_NCHITTEST:
     // call base class first
     lres = DefWindowProc(hwnd, uMsg, wParam, lParam);
     // tweak the result
     if (lres == HTCLIENT) lres = HTCAPTION;
     return lres;
    

    How would you do this with the handled/not-handled model?

    case WM_NCHITTEST:
     if (not handling this message would have resulted in HTCLIENT) {
      lres = HTCAPTION;
      handled = TRUE;
     } else {
      handled = FALSE;
     }
     break;
    

    The trick about that bit in parentheses is that it requires the research department to finish the final details on that time machine they've been working on. It's basically saying, "Return not handled, then follow the message until handling is complete and if the final result is HTCLIENT, then fire up the time machine and rewind to this point so I can change my mind and return handled instead."

    And even if the research department comes through with that time machine, the handled/not-handled model doesn't even solve the original problem!

    The original problem was people failing to call DefWindowProc when they decided that they didn't want to handle a message. In the handled/not-handled model, the equivalent problem would be people returning handled = TRUE unconditionally.

    BOOL NewStyleWindowProc(HWND hwnd, UINT uMsg,
     WPARAM wParam, LPARAM lParam, LRESULT& lres)
    {
     BOOL handled = TRUE;
     switch (uMsg) {
     case WM_THIS: ...; break;
     case WM_THAT: ...; break;
     // no "default: handled = FALSE; break;"
     }
     return handled;
    }
    

    (Side note: The dialog manager uses the handled/not-handled model, and some people would prefer that it use the DefXxxProc model, so you might say "We tried that, and some people didn't like it.")

    This topic raises another one of those "No matter what you do, somebody will call you an idiot" dilemmas. On the one side, there's the "Windows should perform extra testing at runtime to detect bad applications" school, and on the other side, there's the "Windows should get rid of all the code whose sole purpose in life is to detect bad applications" school.

  • The Old New Thing

    WinMain is just the conventional name for the Win32 process entry point

    • 35 Comments

    WinMain is the conventional name for the user-provided entry point in a Win32 program. Just like in 16-bit Windows, where the complicated entry point requirements were converted by language-provided startup code into a call to the user's WinMain function, the language startup code for 32-bit programs also does the work of converting the raw entry point into something that calls WinMain (or wWinMain or main or _wmain).

    The raw entry point for 32-bit Windows applications has a much simpler interface than the crazy 16-bit entry point:

    DWORD CALLBACK RawEntryPoint(void);
    

    The operating system calls the function with no parameters, and the return value (if the function ever returns) is passed to the ExitThread function. In other words, the operating system calls your entry point like this:

    ...
      ExitThread(RawEntryPoint());
      /*NOTREACHED*/
    

    Where do the parameters to WinMain come from, if they aren't passed to the raw entry point?

    The language startup code gets them by asking the operating system. The instance handle for the executable comes from GetModuleHandle(NULL), the command line comes from GetCommandLine, and the nCmdShow comes from GetStartupInfo. (As we saw before, the hPrevInstance is always NULL.)

    If you want to be hard-core, you can program to the raw entry point. Mind you, other parts of your program may rely upon the work that the language startup code did before calling your WinMain. For example, the C++ language startup code will run global constructors before calling into WinMain, and both C and C++ will initialize the so-called security cookie used as part of stack buffer overrun detection. Bypass the language startup code at your peril.
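    The shape of what the startup code does can be sketched in portable C++. Everything below is a stand-in and none of it is the real C runtime: the Fake* functions play the roles of GetModuleHandle(NULL), GetCommandLine, and GetStartupInfo, the flag stands in for running global constructors, and the values they return are invented for the example.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for what the operating system would report.
struct FakeModule {};
static FakeModule g_module;
FakeModule* FakeGetModuleHandle() { return &g_module; }          // role of GetModuleHandle(NULL)
std::string FakeGetCommandLine() { return "app.exe /quiet"; }    // role of GetCommandLine
int FakeGetShowCommand() { return 1; }                           // role of GetStartupInfo

// Stand-in for the work the startup code does before user code runs.
static bool g_startupWorkDone = false;

// The user-provided entry point, with the conventional parameter list.
int SketchWinMain(FakeModule* hInstance, FakeModule* hPrevInstance,
                  const std::string& cmdLine, int nCmdShow)
{
    // User code may silently rely on the startup work having happened.
    if (!g_startupWorkDone || hPrevInstance != nullptr) return -1;
    return (hInstance == &g_module && !cmdLine.empty()) ? nCmdShow : -1;
}

// The raw entry point takes no parameters; it asks for everything itself.
unsigned SketchRawEntryPoint()
{
    g_startupWorkDone = true;  // e.g. global constructors, security cookie
    int result = SketchWinMain(FakeGetModuleHandle(),
                               nullptr,  // hPrevInstance is always null
                               FakeGetCommandLine(),
                               FakeGetShowCommand());
    // Real startup code would typically end the process here rather
    // than return, for the reason given in the bonus chatter.
    return static_cast<unsigned>(result);
}
```

    If you skip the line that sets g_startupWorkDone, the sketch's SketchWinMain notices and fails, which is a toy version of the peril mentioned above.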

    Bonus chatter: Notice that if you choose to return from your entry point function, the operating system passes the return value to ExitThread and not ExitProcess. For this reason, you typically don't want to return from your raw entry point but instead want to call ExitProcess directly. Otherwise, if there are background threads hanging around, they will prevent your process from exiting.
