  • The Old New Thing

    Under what conditions can SetFocus crash? Another debugging investigation

    A customer asked, "Under what conditions can Set­Focus crash?"

    We have been going through our error reports and are puzzled by this one. The call stack is as follows:

    user32!_except_handler4
    ntdll!ExecuteHandler2@20
    ntdll!ExecuteHandler@20
    ntdll!RtlDispatchException
    ntdll!_KiUserExceptionDispatcher@8
    0x130862
    user32!UserCallWinProcCheckWow
    user32!__fnDWORD
    ntdll!_KiUserCallbackDispatcher@12
    user32!NtUserSetFocus
    contoso!DismissPopup
    

    At the point of the crash, the Dismiss­Popup function is calling Set­Focus to restore focus to a window handle that we got from an earlier call to Get­Active­Window. Is this safe? We imagine it might crash if the message handler for the window was unloaded from memory without being properly unregistered; are there any other reasons? More to the point, is there any way to avoid the problem (without fixing the root cause of the crash, which we may not be able to do, e.g. if that window was created by third-party code)?

    The full dump file can be found on <location>. The password is <xyzzy>.

    Indeed, what the customer suspected is what happened, confirmed by the dump file provided.

    The code behind the window procedure got unloaded. User­Call­Win­Proc­Check­Wow is trying to call the window procedure, but instead it took an exception. The address doesn't match any loaded or recently-unloaded module, probably because it was a dynamically generated thunk, like the ones ATL generates.

    There isn't much you can do to defend against this. Even if you manage to detect the problem and avoid calling Set­Focus in this problematic case, all you're doing is kicking the can further down the road. Your program will crash the next time the window receives a message, which it eventually will. (For example, the next time the user changes a system setting and the WM_SETTING­CHANGE message is broadcast to all top-level windows, or the user plugs in an external monitor and the WM_DISPLAY­CHANGE message is broadcast to all top-level windows.)

    Basically, that other component pulled the pin on a grenade and handed it to your thread. That grenade is going to explode sooner or later. The only question is when.

    Such is the danger of giving your application an extension model that allows arbitrary third party code to run. The third party code can do good things to make your program more useful, but it can also do bad things to make your program crash.

  • The Old New Thing

    When designing your user interface, be mindful of the selection-readers

    Occasionally, there will be a feature along the lines of "Whenever the user selects some text, we will pop up an X." And then I have to remind them about so-called selection readers.

    Selection readers are people who habitually select text in a document as they read it. It can be quite maddening looking over the shoulder of a selection reader, because you will see seemingly-random fragments of text get selected, and for those who are selection-deselection readers, the text will almost immediately get cleared. It's like the entire document is blinking with selections. (Other variations of the selection reader are the double-click reader who double-clicks words on the page, the margin-click reader who clicks in the margin, and the hover reader who merely hovers the mouse without clicking.)

    There are a number of theories behind why some people are selection readers.

    • It's a nervous habit to keep one's fingers occupied, similar to spinning a pen.
    • It marks one's place in the document.
    • It gives a sense of accomplishment as one progresses through the document.
    • It helps the eye follow the reading location in the document during a scroll operation.

    I am not a selection reader, but I do click in the document with some regularity. I do this for two reasons.

    1. To give focus to the document area, so that scrolling the mouse wheel or hitting PgDn will scroll the document text.
    2. To place the caret inside the viewport.

    The second reason needs some explanation.

    The caret is the blinking line that shows where your next typed character will be inserted. You can scroll the document so much that the caret goes out of the viewport. For example, if you never click in the document but merely scroll through it, the caret will be at the top of the document, even though you are reading page 25 of 50.

    And then you hit PgDn thinking that you're scrolling down one screen, but instead you're going to the middle of page one. Congratulations, you just lost your place and jumped backward 24 pages.

    Furthermore, there are some programs which are really twitchy about the caret. If you manage to scroll the caret off screen, they will say, "Sure, go ahead and scroll the caret off screen," but then you breathe on the program funny (say, by switching to another program, then switching back with Alt+Tab), and it says, "Whoa, your caret is waaaaay off screen! Let me help you by scrolling the caret back on screen. No need to thank me. Just helping out."

    Of course, what those programs ended up doing is ripping me from page 25 back to page one.

    That's why I consciously click in the document a few times after every scroll operation. It's not yet a habitual operation, so I will sometimes forget, and then just my luck, that's the time I accidentally hit PgDn or Alt+Tab and get teleported backward in the document.

  • The Old New Thing

    Dubious security vulnerability: Luring somebody into your lair

    A security report was received that went something like this:

    The XYZ application does not load its DLLs securely. Create a directory, say, C:\Vulnerable, and copy XYZ.EXE and a rogue copy of ABC.DLL in that directory. When C:\Vulnerable\XYZ.EXE is run, the XYZ program will load the rogue DLL instead of the official copy in the System32 directory. This is a security flaw in the XYZ program.

    Recall that the directory is the application bundle. The fact that the XYZ.EXE program loads ABC.DLL from the application directory rather than the System32 directory is not surprising, because ABC.DLL has been placed inside the XYZ.EXE program's trusted circle.

    But what is the security flaw, exactly?

    Let's identify the attacker, the victim, and the attack scenario.

    The attacker is the person who created the directory with the copy of XYZ.EXE and the rogue ABC.DLL.

    The victim is whatever poor sap runs the XYZ.EXE program from the custom directory instead of from its normal location.

    The attack scenario is

    • Attacker creates a directory, say, C:\Vulnerable.
    • copy C:\Windows\System32\XYZ.EXE C:\Vulnerable\XYZ.EXE
    • copy rogue.dll C:\Vulnerable\ABC.DLL
    • Convince a victim to run C:\Vulnerable\XYZ.EXE.

    When the victim runs C:\Vulnerable\XYZ.EXE, the rogue DLL gets loaded, and the victim is pwned.

    But the victim was already pwned even before getting to that point! Because the victim ran C:\Vulnerable\XYZ.EXE.

    A much simpler attack is to do this:

    • Attacker creates a directory, say, C:\Vulnerable.
    • copy pwned.exe C:\Vulnerable\XYZ.EXE
    • Convince a victim to run C:\Vulnerable\XYZ.EXE.

    The rogue ABC.DLL is immaterial. All it does is crank up the degree of difficulty without changing the fundamental issue: If you can trick a user into running a program you control, then the user is pwned.

    This is another case of if I can run an arbitrary program, then I can do arbitrary things, also known as MS07-052: Code execution results in code execution.

    Note that the real copy of XYZ.EXE in the System32 directory is unaffected. The attack doesn't affect users who run the real copy. And since C:\Vulnerable isn't on the default PATH, the only way to get somebody to run the rogue copy is to trick them into running the wrong copy.
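
    To see why the copy in System32 is untouched, here is a minimal sketch of the one rule the attack relies on: the directory containing the executable is searched before System32. It is a toy model, not the real loader. The function resolve_dll and the imaginary disk are made up for illustration, and the real search has more steps (known DLLs, for example) that are ignored here.

```cpp
#include <cassert>
#include <initializer_list>
#include <set>
#include <string>

// Toy model of the part of the DLL search that the text relies on: the
// directory containing the executable is consulted before System32.
// `files` is the set of full paths that exist on an imaginary disk.
// Returns the full path that would be loaded, or an empty string.
std::string resolve_dll(const std::set<std::string>& files,
                        const std::string& app_dir,
                        const std::string& dll_name)
{
    assert(!dll_name.empty());
    const std::string system_dir = "C:\\Windows\\System32";
    for (const std::string& dir : {app_dir, system_dir}) {
        const std::string candidate = dir + "\\" + dll_name;
        if (files.count(candidate) != 0) {
            return candidate;  // first match wins
        }
    }
    return std::string();  // not found in either directory
}
```

    With the application directory set to C:\Vulnerable, the sketch resolves ABC.DLL to the rogue copy. With the application directory set to C:\Windows\System32, it resolves to the official one.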

    It's like saying that there's a security flaw in Anna Kournikova because people can create things that look like Anna Kournikova and trick victims into running it.

  • The Old New Thing

    If you can set enforcement for a rule, you can set up lack of enforcement

    One of the things you can do with an internal tool I've been calling Program Q is run a program any time somebody wants to add or modify a record. The program has wide latitude in what it can do. It can inspect the record being added/modified, maybe record side information in another table, and one of the things it can decide to do is to reject the operation.

    We have set up a validator in our main table to ensure that the widget being added or modified is priced within the approver's limit. But sometimes, there is an urgent widget request and we want to be able to bypass the validation temporarily. Is there a way to disable the validator just for a specific record, or to disable it for all records temporarily?

    If you can set up a program to validate a record, you can also set up a program to not validate a record.

    Suppose your current validator for adding a widget goes like this:

    if (record.approver.limit < record.price) {
     record.Reject("Price exceeds approver's limit");
     return;
    }
    ... other tests go here ...
    

    And say you want to be able to allow emergency requests to go through even though, say, all approvers are unavailable. Because, maybe, the widget is on fire.

    You could decide that a widget whose description begins with the word EMERGENCY is exempt from all validation, but it generates email to a special mailing list.

    if (record.description.beginsWith("EMERGENCY"))
    {
     // emergency override: send email
     // and bypass the rest of validation
     generateNotificationEmail(record);
     return;
    }
    if (record.approver.limit < record.price) {
     record.Reject("Price exceeds approver's limit");
     return;
    }
    ... other tests go here ...
    

    Of course, the EMERGENCY rule was completely arbitrary. You can come up with whatever rules you like. The point is: If you wrote the rules, you can also write the rules so that they have exceptions.

  • The Old New Thing

    Documentation creates contract, which is why you need to be very careful what you document

    A person with a rude name asks, "Why does MS not document the system metrics used by classic/pre-uxtheme windows and common controls? This image is really useful and I wish all of this was actually documented."

    Actually, that picture explains why it isn't documented.

    Suppose such a picture existed in the Windows 2000 documentation. I don't know what it would say exactly, so suppose, for the purpose of discussion, that it said that the caption buttons are exactly SM_CX­FRAME pixels from the right-hand edge of the window, and that the buttons are exactly SM_CX­SIZE pixels wide, with exactly SM_CX­EDGE pixels of padding between the buttons, and the buttons are exactly SM_CY­SIZE pixels tall, with SM_CY­EDGE pixels between the top of the button and the top of the window.

    Once that picture existed in the documentation, the picture you linked to could never exist.

    The picture from Windows 2000 doesn't include the SM_CX­PADDED­BORDER or the SM_CY­PADDED­BORDER. It can't, because those metrics didn't exist in Windows 2000. Since the diagram is part of the documentation, it is contractual, and it would not be possible to alter the layout of the window caption (say, by incorporating a new metric like SM_CX­PADDED­BORDER), because that would break existing code.

    For example, a program may have looked at the diagram and concluded, "Okay, so if I want to programmatically click the Close button, I can go to the upper right corner of the window, move down SM_CY­FRAME + 1 pixels, move left SM_CX­FRAME + 1 pixels, and click there, and it will hit the button."

    And then Windows Vista shows up, adds some SM_CX­PADDED­BORDER between the Close button and the right edge, and the program stops working.
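
    Here is a sketch of how that program breaks. Everything in it is hypothetical, just like the imagined diagram: the struct names, the layout rule, and the numbers are assumptions for illustration, not how any version of Windows actually lays out its caption.

```cpp
#include <cassert>

// Hypothetical metric values, named after the system metrics in the text.
struct ToyMetrics {
    int cxFrame;         // stands in for SM_CXFRAME
    int cyFrame;         // stands in for SM_CYFRAME
    int cxSize;          // stands in for SM_CXSIZE
    int cySize;          // stands in for SM_CYSIZE
    int cxPaddedBorder;  // stands in for SM_CXPADDEDBORDER (0 = metric absent)
};

struct ToyRect { int left, top, right, bottom; };  // right/bottom exclusive

// Where the Close button sits, in window coordinates, under the assumed
// layout: inset from the right edge by the frame plus any padding.
ToyRect close_button_rect(const ToyMetrics& m, int windowWidth)
{
    const int right = windowWidth - m.cxFrame - m.cxPaddedBorder;
    const int top = m.cyFrame;
    return {right - m.cxSize, top, right, top + m.cySize};
}

// Whether the program's hard-coded click point lands on the button.
// The program knows only the old diagram, so it ignores cxPaddedBorder.
bool hardcoded_click_hits(const ToyMetrics& m, int windowWidth)
{
    assert(windowWidth > 0);
    const int x = windowWidth - (m.cxFrame + 1);
    const int y = m.cyFrame + 1;
    const ToyRect r = close_button_rect(m, windowWidth);
    return x >= r.left && x < r.right && y >= r.top && y < r.bottom;
}
```

    With ToyMetrics{4, 4, 18, 18, 0} and a 300-pixel-wide window, the click lands on the button. Change the last field to 4 and the same click lands in the new padding instead.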

    Publishing the redlines would force the visual layout to be locked in stone. Windows 95 could not have added the Close button. Windows Vista could not have added extra padding around the buttons.

    Note that changing the visual layout of the caption does not break programs which draw their own caption bar. They will continue to draw the caption bar their own custom way. If they tried to mimic the Windows 2000 caption bar, then they will continue to mimic the Windows 2000 caption bar, even on Windows Vista. But nobody gets hurt, because the application is doing both the drawing and the hit-testing, so it remains in sync with itself.

  • The Old New Thing

    Wow, they really crammed a lot into those 410 transistors

    A colleague of mine pointed out that in yesterday's Seattle Times, there was an article about Moore's Law. To illustrate the progress of technology, they included some highlights, including the following piece of trivia:

    The Core 2 Duo processor with 410 transistors made its debut in 2002.

    You can see the photo and caption in the online version of the article if you go to the slide show and look at photo number three.

    This is an impressive feat. Intel managed to cram a Core 2 Duo into only an eighth as many transistors as the 6502.

    On the other hand, it does help to explain why the chip has so few registers. There weren't any transistors left!

  • The Old New Thing

    It rather involved being on the other side of this airtight hatchway: Invalid parameters from one security level crashing code at the same security level (yet again)

    It's the bogus vulnerability that keeps on giving. This time a security researcher found a horrible security flaw in Sys­Alloc­String­Len:

    The Sys­Alloc­String­Len function is vulnerable to a denial-of-service attack. [Long description of reverse-engineering deleted.]

    The Sys­Alloc­String­Len does not check the length parameter properly. If the provided length is larger than the actual length of the buffer, it may encounter an access violation when reading beyond the end of the buffer. Proof of concept:

    SysAllocStringLen(L"Example", 0xFFFFFF);
    

    Credit for this vulnerability should be given to XYZ Security Labs. Copyright © XYZ Security Labs. All rights reserved.

    As with other issues of this type, there is no elevation. The attack code and the code that crashes are on the same side of the airtight hatchway. If your goal was to make the process crash, then instead of passing invalid parameters to the Sys­Alloc­String­Len function, you can launch the denial of service attack much more easily:

    #include <windows.h>

    int __cdecl main(int, char**)
    {
        ExitProcess(0);
    }
    

    Congratulations, you just launched a denial-of-service attack against yourself.

    In order to trigger an access violation in the Sys­Alloc­String­Len function, you must already have had enough privilege to run code, which means that you already have enough privilege to terminate the application without needing the Sys­Alloc­String­Len function.

    Once again, we have a case of MS07-052: Code execution results in code execution.

    Bonus bogus vulnerability report:

    The Draw­Text function is vulnerable to a denial-of-service attack because it does not validate that the lpchText parameter is a valid pointer. If you pass NULL as the second parameter, the function crashes. We have found many functions in the system which are vulnerable to the same issue.

    ¹ Now, of course, if there were some way you could externally induce a program into passing invalid parameters to the Sys­Alloc­String­Len function, then you'd be onto something. But even then, the vulnerability would be in the program that is passing the invalid parameters, not in the Sys­Alloc­String­Len function itself.

  • The Old New Thing

    The details of the major incident were not clearly articulated, but whatever it is, it's already over

    When a server is taken offline, be it a planned unplanned outage or an unplanned unplanned outage or something else, the operations team sends out a series of messages alerting customers to the issue.

    Some time ago, I received a notification that went like this:

    From: Adam Smith
    Subject: Nosebleed Service : Major Incident Notification - Initial
    Date: mm/dd/yyyy 1:16AM

    Major Incident Notification

    dfdsfsd

    Affected Users

    fdfsdfsdf

    Start: mm/dd/yyyy 12:00AM Pacific Standard Time
    mm/dd/yyyy 8:00AM UTC
    End: No ETA at this time.

    Incident Duration: 1 hour 15 minutes

    Impact

    fsdfdsfsdf

    Continued Notifications

    fdsfsdf

    Information & Support

    • Other Support: Please send questions or feedback to

    Thank you,

    Adam Smith
    IT Major Incident Management

    Well that clears things up.

    Curiously, the message includes an incident duration but doesn't have an ETA. Thankfully, the message was sent one minute after the incident was over, so by the time I got it, everything was back to normal.

  • The Old New Thing

    Finding the constructor by scanning memory for the vtable

    In Looking for leaked objects by their vtable, we used the object's constructor to locate the vtable, and then scanned the heap for the vtable to find the leaked object. But you can run this technique in reverse, too.

    Suppose you found an object and you want to find its constructor. This is not a problem if you have the source code, but if you are doing some reverse-engineering for application compatibility purposes, you don't have the luxury of the application source code. You may have figured out that the application fails because the byte at offset 0x50 is zero, but on the previous version of Windows, it was nonzero. You want to find out who sets the byte at offset 0x50, so that you can see why it is setting it to zero instead of a nonzero value.

    If the object has a vtable, you can scan the code segments for a copy of the vtable. It will show up in an instruction like

    mov dword ptr [reg], vtable_address
    

    This is almost certainly the object's constructor, setting up the object vtable as part of construction. You can set a breakpoint here to break when the object is constructed, and then you can set a write breakpoint on offset 0x50 to see where its value is set.
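
    The scan itself can be sketched as a plain byte search. This assumes a 32-bit target, where the vtable address appears as a four-byte little-endian immediate inside the instruction. The function name and the idea of first copying the code segment into a vector are assumptions for illustration; in practice you would probably let your debugger's memory search do this for you.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns every offset in `image` at which the 32-bit little-endian value
// `vtable_address` appears. `image` stands in for the bytes of a code
// segment that have already been read out of the target process.
std::vector<std::size_t> find_vtable_references(
    const std::vector<std::uint8_t>& image, std::uint32_t vtable_address)
{
    // Split the address into the byte order it has in memory on x86.
    const std::uint8_t needle[4] = {
        static_cast<std::uint8_t>(vtable_address & 0xFF),
        static_cast<std::uint8_t>((vtable_address >> 8) & 0xFF),
        static_cast<std::uint8_t>((vtable_address >> 16) & 0xFF),
        static_cast<std::uint8_t>((vtable_address >> 24) & 0xFF),
    };

    std::vector<std::size_t> hits;
    for (std::size_t i = 0; i + 4 <= image.size(); ++i) {
        if (image[i] == needle[0] && image[i + 1] == needle[1] &&
            image[i + 2] == needle[2] && image[i + 3] == needle[3]) {
            hits.push_back(i);  // offset of the immediate, not the opcode
        }
    }
    return hits;
}
```

    For instance, the bytes C7 06 78 56 34 12 encode mov dword ptr [esi], 12345678h, so a search for 0x12345678 reports the offset of the 78. Each hit is only a candidate; disassemble around it to confirm that it really is a store into the start of an object.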

  • The Old New Thing

    Sure, we have RegisterWindowMessage and RegisterClipboardFormat, but where are DeregisterWindowMessage and DeregisterClipboardFormat?

    The Register­Window­Message function lets you create your own custom messages that are globally unique. But how do you free the message format when you're done, so that the number can be reused for another message? (Similarly, Register­Clipboard­Format and clipboard formats.)

    You don't. There is no Deregister­Window­Message function or Deregister­Clipboard­Format function. Once allocated, a registered window message or registered clipboard format hangs around until you log off.

    There is room for around 16,000 registered window messages and registered clipboard formats, and in practice exhaustion of these pools of numbers is not an issue. Even if every program registers 100 custom messages, you can run 160 unique programs before running into a problem. And most people don't even have 160 different programs installed in the first place. (And if you do, you almost certainly don't run all of them!) In practice, the number of registered window messages is well under 1000.
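
    The arithmetic, as a small sketch. The range 0xC000 through 0xFFFF is the documented range for values returned by RegisterWindowMessage. The constant and function names are made up here, and treating the entire range as available is a simplifying assumption.

```cpp
#include <cassert>

// Documented range of values returned by RegisterWindowMessage.
constexpr unsigned kFirstRegistered = 0xC000;
constexpr unsigned kLastRegistered = 0xFFFF;

// Number of values in that range: 0x4000, which is 16384 ("around 16,000").
constexpr unsigned kPoolSize = kLastRegistered - kFirstRegistered + 1;

// How many distinct programs fit into a pool of `pool_size` values if each
// program registers `per_program` unique names.
unsigned programs_before_exhaustion(unsigned pool_size, unsigned per_program)
{
    assert(per_program != 0);
    return pool_size / per_program;
}
```

    Using the rounded figure from the text, programs_before_exhaustion(16000, 100) gives 160.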

    A customer had a problem with exhaustion of registered window messages. "We are using a component that uses the Register­Window­Message function to register a large number of unique messages which are constantly changing. Since there is no way to unregister them, the registered window message table eventually fills up and things start failing. Should we use Global­Add­Atom and Global­Delete­Atom instead of Register­Window­Message? Or can we use Global­Delete­Atom to delete the message registered by Register­Window­Message?"

    No, you should not use Global­Add­Atom to create window messages. The atom that comes back from Global­Add­Atom comes from the global atom table, which is different from the registered window message table. The only way to get registered window messages is to call Register­Window­Message. Say you call Global­Add­Atom("X") and you get atom 49443 from the global atom table. Somebody else calls Register­Window­Message("Y") and they get registered window message number 49443. You then post message 49443 to a window, and it thinks that it is message Y, and bad things happen.
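
    A toy model of that collision. The NameTable class below is invented for illustration and is not how the system implements either table. In particular, handing out numbers sequentially from 0xC000 is an assumption that makes the overlap easy to see. What it does capture is that the two tables are separate and draw from the same range of numbers.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy stand-in for one system table that hands out 16-bit identifiers for
// names. Sequential allocation starting at 0xC000 is an assumption made
// for the demonstration.
class NameTable {
public:
    // Returns the identifier for `name`, allocating one if needed.
    std::uint16_t add(const std::string& name)
    {
        auto found = ids_.find(name);
        if (found != ids_.end()) {
            return found->second;
        }
        assert(next_ <= 0xFFFF);  // the toy table is full beyond this point
        const std::uint16_t id = static_cast<std::uint16_t>(next_++);
        ids_[name] = id;
        return id;
    }

    // Returns the name stored under `id`, or an empty string if none.
    std::string lookup(std::uint16_t id) const
    {
        for (const auto& entry : ids_) {
            if (entry.second == id) {
                return entry.first;
            }
        }
        return std::string();
    }

private:
    std::map<std::string, std::uint16_t> ids_;
    unsigned next_ = 0xC000;
};
```

    Create two tables, add "X" to one and "Y" to the other, and both calls return the same number. The number alone does not say which table it came from, so a window that receives it as a message looks it up in the message table and sees Y.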

    And you definitely should not use Global­Delete­Atom in a misguided attempt to deregister a window message. You're going to end up deleting some unrelated atom, and things will start going downhill.

    What you need to do is fix the component so it does not register a lot of window messages with constantly-changing names. Instead, encode the uniqueness in some other way. For example, instead of registering a hundred messages of the form Contoso user N logged on, just register a single Contoso user logged on message and encode the user number in the wParam and lParam payloads. Most likely, one or the other parameter is already being used to carry nontrivial payload information, so you can just add the user number to that payload. (And this also means that your program won't have to keep a huge table of users and corresponding window messages.)
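
    A sketch of the difference, counting how many registered names each approach uses up. ToyMessage and the helper functions are hypothetical stand-ins; the name field plays the role of the string passed to RegisterWindowMessage, and the wParam field plays the role of the message's first payload.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// A toy message: a registered name plus one payload value.
struct ToyMessage {
    std::string name;
    std::uintptr_t wParam;
};

// What the component did: a different registered name for each user.
std::vector<ToyMessage> one_name_per_user(unsigned user_count)
{
    std::vector<ToyMessage> messages;
    for (unsigned user = 1; user <= user_count; ++user) {
        messages.push_back(
            {"Contoso user " + std::to_string(user) + " logged on", 0});
    }
    return messages;
}

// The suggested fix: a single registered name, user number in the payload.
std::vector<ToyMessage> one_name_with_payload(unsigned user_count)
{
    std::vector<ToyMessage> messages;
    for (unsigned user = 1; user <= user_count; ++user) {
        messages.push_back({"Contoso user logged on", user});
    }
    return messages;
}

// Table slots consumed: one per distinct registered name.
std::size_t slots_consumed(const std::vector<ToyMessage>& messages)
{
    std::set<std::string> names;
    for (const ToyMessage& m : messages) {
        names.insert(m.name);
    }
    return names.size();
}
```

    For a hundred users, the first approach consumes a hundred slots and the second consumes one, no matter how many users come and go.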

    Bonus chatter: It is the case that properties added to a window via Set­Prop use global atoms, as indicated by the documentation. This is an implementation detail that got exposed, so now it's contractual. And it was a bad idea, as I discussed earlier.

    Sometimes, people try to get clever and manually manage the atoms used for storing properties. They manually add the atom, then access the property by atom, then remove the properties, then delete the atom. This is a high-risk maneuver because there are so many things that can go wrong. For example, you might delete the atom prematurely (unaware that it was still being used by some other window), then the atom gets reused, and now you have a property conflict. Or you may have a bug that calls Global­Delete­Atom for an atom that was not obtained via Global­Add­Atom. (Maybe you got it via Global­Find­Atom or Enum­Props.)

    I've even seen code that does this:

    atom = GlobalAddAtom(name);
    
    // Some apps are delete-happy and run around deleting atoms they shouldn't.
    // If they happen to delete ours by accident, things go bad really fast.
    // Prevent this from happening by bumping the atom refcount a few extra
    // times so accidental deletes won't destroy it.
    GlobalAddAtom(name);
    GlobalAddAtom(name);
    

    So we've come full circle. There is a way to delete an unused atom, but people end up deleting them incorrectly, so this code tries to make the atom undeletable. Le Chatelier's Principle strikes again.
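
    Why the extra calls in that snippet work can be seen in a toy model of a reference-counted table. The class is made up for illustration. It mirrors only the documented counting behavior: adding a name that already exists increments its count, deleting decrements it, and the entry goes away when the count reaches zero.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy reference-counted atom table, keyed by name for simplicity.
class ToyAtomTable {
public:
    // Adds the name, or bumps its reference count if already present.
    void add(const std::string& name) { ++counts_[name]; }

    // Drops one reference; removes the entry when none are left.
    void remove(const std::string& name)
    {
        auto it = counts_.find(name);
        if (it == counts_.end()) {
            return;  // nothing to delete
        }
        if (--it->second == 0) {
            counts_.erase(it);
        }
    }

    bool exists(const std::string& name) const
    {
        return counts_.count(name) != 0;
    }

private:
    std::map<std::string, unsigned> counts_;
};
```

    Three adds followed by two stray deletes leave the atom alive. A third delete finally destroys it, so the padding protects against only as many accidental deletes as there were extra adds.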
