• The Old New Thing

    Advantages of knowing your x86 machine code


    Next time you find yourself debugging in assembly language (which for some of us is the only way we debug), here are some machine code tricks you may wish to try out:

    This is the single-byte NOP opcode. If you want to patch out code and don't want to think about it, just whack some 90's over it. To undo it, you have to patch the original code bytes back in, of course.

    This is the single-byte INT 3 opcode, which breaks into the debugger.

    These are the opcodes for JZ and JNZ. If you want to reverse the sense of a test, you can swap one for the other. Other useful pairs are 72/73 (JB/JNB), 76/77 (JBE/JA), 7C/7D (JL/JGE), and 7E/7F (JLE/JG). You don't have to memorize any of these values; all you have to recognize is that toggling the bottom bit reverses the sense of the test. To undo this, just flip the bit a second time.

    This is the unconditional short jump instruction. If you want to convert a conditional jump to an unconditional one, change the 74 (say) to EB. To undo this, you have to remember what the original byte was.

    On the other hand, if you want to convert a conditional short jump to a never-taken jump, you can patch the second byte to zero. For example, "74 1C" becomes "74 00". The jump is still there; it just jumps to the next instruction and therefore has no effect. To undo this, you have to remember the original jump offset.

    These are the opcodes for the "MOV EAX,immed32" and the "CALL" instructions. I use them to patch out calls. If there's a call to a function that I don't like, instead of wiping it all to 90's, I just change the E8 to a B8. To undo it, change the B8 back to an E8.

    It has been pointed out that this works only for functions that take zero stack parameters; otherwise, your stack gets corrupted. More generally, you can use 83 C4 XX 90 90 (ADD ESP, XX; NOP; NOP) where XX is the number of bytes you need to pop. Personally, I don't remember the machine code for these instructions so I tend to rewrite the CALL instruction so it calls the "RETD" at the end of the function.

    I prefer these single-byte patches to wholesale erasure with 90 because they are easier to undo if you realize that you want to restore the code to the way it was before you messed with it.

  • The Old New Thing

    Why does Windows not recognize my USB device as the same device if I plug it into a different port?


    You may have noticed that if you take a USB device and plug it into your computer, Windows recognizes it and configures it. Then if you unplug it and replug it into a different USB port, Windows gets a bout of amnesia and thinks that it's a completely different device instead of using the settings that applied when you plugged it in last time. Why is that?

    The USB device people explained that this happens when the device lacks a USB serial number.

    Serial numbers are optional on USB devices. If the device has one, then Windows recognizes the device no matter which USB port you plug it into. But if it doesn't have a serial number, then Windows treats each appearance on a different USB port as if it were a new device.

    (I remember that one major manufacturer of USB devices didn't quite understand how serial numbers worked. They gave all of their devices serial numbers, that's great, but they all got the same serial number. Exciting things happened if you plugged two of their devices into a computer at the same time.)

    But why does Windows treat it as a different device if it lacks a serial number and shows up on a different port? Why can't it just say, "Oh, there you are, over there on another port."

    Because that creates random behavior once you plug in two such devices. Depending on the order in which the devices get enumerated by Plug and Play, the two sets of settings would get assigned seemingly randomly at each boot. Today the settings match up one way, but tomorrow when the devices are enumerated in the other order, the settings are swapped. (You get similarly baffling behavior if you plug in the devices in different order.)

    In other words: Things suck because (1) things were already in bad shape—this would not have been a problem if the device had a proper serial number—and (2) once you're in this bad state, the alternative sucks more. The USB stack is just trying to make the best of a bad situation without making it any worse.

  • The Old New Thing

    A history of GlobalLock, part 4: A peek at the implementation


    On one of our internal discussion mailing lists, someone posted the following question:

    We have some code that was using DragQueryFile to extract file paths. The prototype for DragQueryFile appears as follows:

    UINT DragQueryFile(
        HDROP hDrop,
        UINT iFile,
        LPTSTR lpszFile,
        UINT cch

    In the code we have, instead of passing an HDROP as the first parameter, we were passing in a pointer to a DROPFILES structure. This code was working fine for the last few months until some protocol changes we made in packet layouts over the weekend.

    I know that the bug is that we should be passing an HDROP handle instead of a pointer, but I am just curious as to why this worked so flawlessly until now. In other words, what determines the validity of a handle and how come a pointer can sometimes be used instead of a handle?

    GlobalLock accepts HGLOBALs that refer to either GMEM_MOVEABLE or GMEM_FIXED memory. The rule for Win32 is that for fixed memory, the HGLOBAL is itself a pointer to the memory, whereas for moveable memory, the HGLOBAL is a handle that needs to be converted to a pointer.

    GlobalAlloc works closely with GlobalLock so that GlobalLock can be fast. If the memory happens to be aligned just right and pass some other tests, GlobalLock says "Woo-hoo, this is a handle to a GMEM_FIXED block of memory, so I should just return the pointer back."

    The packet layout changes probably altered the alignment, which in turn caused GlobalLock no longer to recognize (mistakenly) the invalid parameter as a GMEM_FIXED handle. It then went down other parts of the validation path and realized that the handle wasn't valid at all.

    This is not, of course, granting permission to pass bogus pointers to GlobalLock; I'm just explaining why the problem kicked up all of a sudden even though it has always been there.

    With that lead-in, what's the real story behind GMEM_MOVEABLE in Win32?

    GMEM_MOVEABLE memory allocates a "handle". This handle can be converted to memory via GlobalLock. You can call GlobalReAlloc() on an unlocked GMEM_MOVEABLE block (or a locked GMEM_MOVEABLE block when you pass the GMEM_MOVEABLE flag to GlobalReAlloc which means "move it even if it's locked") and the memory will move, but the handle will continue to refer to it. You have to re-lock the handle to get the new address it got moved to.

    GMEM_MOVEABLE is largely unnecessary; it provides additional functionality that most people have no use for. Most people don't mind when Realloc hands back a different value from the original. GMEM_MOVEABLE is primarily for the case where you hand out a memory handle, and then you decide to realloc it behind the handle's back. If you use GMEM_MOVEABLE, the handle remains valid even though the memory it refers to has moved.

    This may sound like a neat feature, but in practice it's much more trouble than it's worth. If you decide to use moveable memory, you have to lock it before accessing it, then unlock it when done. All this lock/unlock overhead becomes a real pain, since you can't use pointers any more. You have to use handles and convert them to pointers right before you use them. (This also means no pointers into the middle of a moveable object.)

    Consequently, moveable memory is useless in practice.

    Note, however, that GMEM_MOVEABLE still lingers on in various places for compatibility reasons. For example, clipboard data must be allocated as moveable. If you break this rule, some programs will crash because they made undocumented assumptions about how the heap manager internally manages handles to moveable memory blocks instead of calling GlobalLock to convert the handle to a pointer.

    A very common error is forgetting to lock global handles before using them. If you forget and instead just cast a moveable memory handle to a pointer, you will get strange results (and will likely corrupt the heap). Specifically, global handles passed via the hGlobal member of the STGMEDIUM structure, returned via the GetClipboardData function, as well as lesser-known places like the hDevMode and hDevNames members of the PRINTDLG structure are all potentially moveable. What's scary is that if you make this mistake, you might actually get away with it for a long time (if the memory you're looking at happened to be allocated as GMEM_FIXED), and then suddenly one day it crashes because all of a sudden somebody gave you memory that was allocated as GMEM_MOVEABLE.

    Okay, that's enough about the legacy of the 16-bit memory manager for now. My head is starting to hurt...

  • The Old New Thing

    A history of GlobalLock, part 3: Transitioning to Win32


    Now that you know how the 16-bit memory manager handled the global heap, it's time to see how this got transitioned to the new 32-bit world.

    The GlobalAlloc function continued to emulate all its previous moveability rules, but the return value of GlobalAlloc was no longer a selector since Win32 used the processor in "flat mode".

    This means that the old trick of caching a selector and reallocating the memory out from under it no longer worked.

    Moveability semantics were preserved. Memory blocks still had a lock count, even though it didn't really accomplish anything since Win32 never compacted memory. (Recall that the purpose of the lock count was to prevent memory from moving during a compaction.)

    Moveable memory and locking could have been eliminated completely, if it weren't for the GlobalFlags function. This function returns several strange bits of information—now entirely irrelevant—the most troubling of which is the lock count. Consequently, the charade of locking must be maintained just in case there's some application that actually snoops at the lock count, or a program that expected the GlobalReAlloc function to fail on a locked block.

    Aside from that, moveable memory gets you nothing aside from overhead.

    The LocalAlloc function also carries the moveability overhead, but since local memory was never passed between DLLs in Win16, the local heap functions don't carry as much 16-bit compatibility overhead as the global heap functions. LocalAlloc is preferred over GlobalAlloc in Win32 for that reason. (Of course, many functions require a specific type of memory allocation, in which case you don't have any choice. The clipboard, for example, requires moveable global handles, and COM requires use of the task allocator.)

    Next time, an insight into how locking is implemented (even though it doesn't do anything).

  • The Old New Thing

    Ein hundert Dinge, die in den Vereinigten Staaten besser bleiben


    I think it is a trait common to many people that they are fascinated by how their country is viewed by others. The leftist Die Tageszeitung from Germany reacted to the result of the most recent U.S. presidential election with their list of one hundred things that are still better in the United States.

    Those who cannot read German can use a translation provided by a Metafilter reader (which is where I found this article, if you haven't figured it out). But of course it's better to read something in its original language if you can.

  • The Old New Thing

    A history of GlobalLock, part 2: Selectors


    With the advent of the 80286, Windows could take advantage of that processor's "protected mode". processor. There was still no virtual memory, but you did have memory protection. Global handles turned into "descriptors", more commonly known as "selectors".

    Architectural note: The 80286 did have support for both a "local descriptor table" and a "global descriptor table", thereby making it possible to have each process run in something vaguely approximating a separate address space, but doing so would have broken Windows 1.0 compatibility, where all memory was global.

    Addresses on the 80286 in protected mode consisted of a selector and an offset rather than a segment and an offset. This may seem like a trivial change, but it actually is important because a selector acts like a handle table in hardware.

    When you created a selector, you specified a whole bunch of attributes, such as whether it was a code selector or a data selector, whether it was present or discarded, and where in memory it resided. (Still no virtual memory, so all memory is physical.)

    GlobalAlloc() now returned a selector. If you wanted to, you could just use it directly as the selector part of an address. When you loaded a selector, the CPU checked whether the selector was present, discarded, or invalid.

    • If present, then everything was fine.
    • If discarded, a "not present" exception was raised. (Wow, we have exceptions now!) The memory manager trapped this exception and did whatever was necessary to make the selector present. This meant allocating the memory (possibly compacting and discarding to make room for it), and if it was a code selector, loading the code back off the disk and fixing it up.
    • If invalid, an Unrecoverable Application Error was raised. This is the infamous "UAE".

    Since memory accesses were now automatically routed through the descriptor table by the hardware, it meant that memory could be moved around with relative impunity. All existing pointers would remain valid since the selector remains the same; all that changes is the internal bookkeeping in the descriptor table that specified which section of memory the descriptor referred to.

    For compatibility with Windows 1.0, GlobalAlloc() continued to emulate all the moveability rules as before. It's just that the numeric value of the selector never really changed any more. (And please let's just agree to disagree on whether backwards compatibility is a good thing or not.)

    Next time, transitioning to Win32.

  • The Old New Thing

    A history of GlobalLock, part 1: The early years


    Once upon a time, there was Windows 1.0. This was truly The Before Time. 640K. Segments. Near and far pointers. No virtual memory. Co-operative multitasking.

    Since there was no virtual memory, swapping had to be done with the co-operation of the application. When there was an attempt to allocate memory (either for code or data) and insufficient contiguous memory was available, the memory manager had to perform a process called "compaction" to make the desired amount of contiguous memory available.

    • Code segments could be discarded completely, since they can be reloaded from the original EXE. (No virtual memory - there is no such thing as "paged out".) Discarding code requires extra work to make sure that the next time the code got called, it was re-fetched from memory. How this was done is not relevant here, although it was quite a complicated process in and of itself.
    • Memory containing code could be moved around, and references to the old address were patched up to refer to the new address. This was also a complicated process not relevant here.
    • Memory containing data could be moved around, but references to the old addresses were not patched up. It was the application's job to protect against its memory moving out from under it if it had a cached pointer to that memory.
    • Memory that was locked or fixed (or a third category, "wired" -- let's not get into that) would never be moved.

    When you allocated memory via GlobalAlloc(), you first had to decide whether you wanted "moveable" memory (memory which could be shuffled around by the memory manager) or "fixed" memory (memory which was immune from motion). Conceptually, a "fixed" memory block was like a moveable block that was permanently locked.

    Applications were strongly discouraged from allocating fixed memory because it gummed up the memory manager. (Think of it as the memory equivalent of an immovable disk block faced by a defragmenter.)

    The return value of GlobalAlloc() was a handle to a global memory block, or an HGLOBAL. This value was useless by itself. You had to call GlobalLock() to convert this HGLOBAL into a pointer that you could use.

    GlobalLock() did a few things:

    • It forced the memory present (if it had been discarded). Other memory blocks may need to be discarded or moved around to make room for the memory block being locked.
    • If the memory block was "moveable", then it also incremented the "lock count" on the memory block, thus preventing the memory manager from moving the memory block during compaction. (Lock counts on "fixed" memory aren't necessary because they can't be moved anyway.)

    Applications were encouraged to keep global memory blocks locked only as long as necessary in order to avoid fragmenting the heap. Pointers to unlocked moveable memory were forbidden since even the slightest breath -- like calling a function that happened to have been discarded -- would cause a compaction and invalidate the pointer.

    Okay, so how did this all interact with GlobalReAlloc()?

    It depends on how the memory was allocated and what its lock state was.

    If the memory was allocated as "moveable" and it wasn't locked, then the memory manager was allowed to find a new home for the memory elsewhere in the system and update its bookkeeping so the next time somebody called GlobalLock(), they got a pointer to the new location.

    If the memory was allocated as "moveable" but it was locked, or if the memory was allocated as "fixed", then the memory manager could only resize it in place. It couldn't move the memory either because (if moveable and locked) there were still outstanding pointers to it, as evidenced by the nonzero lock count, or (if fixed) fixed memory was allocated on the assumption that it would never move.

    If the memory was allocated as "moveable" and was locked, or if it was allocated as "fixed", then you can pass the GMEM_MOVEABLE flag to override the "may only resize in place" behavior, in which case the memory manager would attempt to move the memory if necessary. Passing the GMEM_MOVEABLE flag meant, "No, really, I know that according to the rules, you can't move the memory, but I want you to move it anyway. I promise to take the responsibility of updating all pointers to the old location to point to the new location."

    (Raymond actually remembers using Windows 1.0. Fortunately, the therapy sessions have helped tremendously.)

    Next time, the advent of selectors.

  • The Old New Thing

    Why do I sometimes see redundant casts before casting to LPARAM?


    If you read through old code, you will often find casts that seem redundant.

    SendMessage(hwndListBox, LB_ADDSTRING, 0, (LPARAM)(LPSTR)"string");

    Why was "string" cast to LPSTR? It's already an LPSTR!

    These are leftovers from 16-bit Windows. Recall that in 16-bit Windows, pointers were near by default. Consequently, "string" was a near pointer to a string. If the code had been written as

    SendMessage(hwndListBox, LB_ADDSTRING, 0, (LPARAM)"string");

    then it would have taken the near pointer and cast it to a long. Since a near pointer is a 16-bit value, the pointer would have been zero-extended to the 32-bit size of a long.

    However, all pointers in window messages must be far pointers because the window procedure for the window might very well be implemented in a different module from the sender. Recall that near pointers are interpreted relative to the default selector, and the default selector for each module is different. Sending a near pointer to another module will result in the pointer being interpreted relative to the recipient's default selector, which is not the same as the sender's default selector.

    The intermediate cast to LPSTR converts the near pointer to a far pointer, LP being the Hungarian prefix for far pointers (also known as "long pointers"). Casting a near pointer to a far pointer inserts the previously-implied default selector, so that the cast to LPARAM captures the full 16:16 far pointer.

    Aren't you glad you don't have to worry about this any more?

  • The Old New Thing

    What was the point of the GMEM_SHARE flag?


    The GlobalAlloc function has a GMEM_SHARE flag. What was it for?

    In 16-bit Windows, the GMEM_SHARE flag controlled whether the memory should outlive the process that allocated it. By default, all memory allocated by a process was automatically freed when that process exited.

    Passing the GMEM_SHARE flag suppressed this automatic cleanup. That's why you had to use this flag when allocating memory to be placed on the clipboard or when you transfer it via OLE to another process. Since the clipboard exists after your program exits, any data you put on the clipboard needs to outlive the program. If you neglected to set this flag, then once your program exited, the memory that you put on the clipboard would be cleaned up, resulting in a crash the next time someone tried to read that data from the clipboard.

    (The GMEM_SHARE flag also controlled whether the memory could be freed by a process other than the one that allocated it. This makes sense given the above semantics.)

    Note that the cleanup rule applies to global memory allocated by DLLs on behalf of a process. Authors of DLLs had to be careful to keep track of whether any particular memory allocation was specific to a process (and should be freed when the process exited) or whether it was something the DLL was planning on sharing across processes for its own internal bookkeeping (in which case it shouldn't be freed). Failure to be mindful of this distinction led to bugs like this one.

    Thank goodness this is all gone in Win32.

  • The Old New Thing

    And to think they let me get away with it for five years


    According to the Constitution of the State of New Jersey, Article II, paragraph 6:

    No idiot or insane person shall enjoy the right of suffrage.

    Why are constitutional articles labelled with Roman numerals? Makes it sound like the Super Bowl or something.

    The state appellate court did rule a few years ago that being hospitalized for psychiatric treatment does not constitute being insane for the purpose of determining voter eligibility.

    Today is Election Day in the United States. Don't forget to vote! (Void where prohibited.)

Page 369 of 426 (4,251 items) «367368369370371»