• The Old New Thing

    If your callback fails, it's your responsibility to set the error code


    There are many cases where a callback function is allowed to halt an operation. For example, you might decide to return FALSE to the WM_NCCREATE message to prevent the window from being created, or you might decide to return FALSE to one of the many enumeration callback functions such as the EnumWindowsProc callback. When you do this, the enclosing operation will return failure back to its caller: the CreateWindow function returns NULL; the EnumWindows function returns FALSE.

    Of course, when this happens, the enclosing operation doesn't know why the callback failed; all it knows is that it failed. Consequently, it can't set a meaningful value to be retrieved by the GetLastError function.

    If you want something meaningful to be returned by the GetLastError function when your callback halts the operation, it's the callback's responsibility to set that value by calling the SetLastError function.
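
    The division of labor can be sketched in portable C++. This is only a model, not the Win32 API: EnumerateItems, ItemCallback and RejectNegative are hypothetical names, and the standard errno plays the role that SetLastError and GetLastError play in the text.

```cpp
#include <cassert>
#include <cerrno>
#include <vector>

// Hypothetical callback type, standing in for something like EnumWindowsProc.
using ItemCallback = bool (*)(int item, void* context);

// Hypothetical enumerator, standing in for something like EnumWindows.
// It stops at the first callback that returns false and reports failure.
// It never touches errno, because it has no idea why the callback stopped.
bool EnumerateItems(const std::vector<int>& items, ItemCallback callback, void* context)
{
    assert(callback != nullptr);
    for (int item : items) {
        if (!callback(item, context)) {
            return false;   // all we know is "the callback said stop"
        }
    }
    return true;
}

// A callback that halts the enumeration on a negative item.
// Recording the reason is the callback's job, so it sets errno first.
bool RejectNegative(int item, void* /*context*/)
{
    if (item < 0) {
        errno = EINVAL;
        return false;
    }
    return true;
}
```

    A caller that sees EnumerateItems return false can then read errno and find EINVAL. In Win32 terms, the callback would call SetLastError before returning FALSE, and the caller of EnumWindows would call GetLastError after seeing FALSE.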

    This is something that is so obvious I didn't think it needed to be said; it falls into the "because computers aren't psychic (yet)" category of explanation. But apparently it wasn't obvious enough, so now I'm saying it.

  • The Old New Thing

    The vtable does not always go at the start of the object


    Although the diagrams I presented in my discussion of The layout of a COM object place the vtable at the beginning of the underlying C++ object, there is no actual requirement that it be located there. It is perfectly legal for the vtable to be in the middle or even at the end of the object, as long as the functions in the vtable know how to convert the address of the vtable pointer to the address of the underlying object. In fact, in the second diagram in that article, you can see that the "q" pointer points into the middle of the object.

    Here's an example that puts the vtable at the end of the object:

    class Data {
    public:
     Data() : m_cRef(1) { }
     virtual ~Data() { }
     LONG m_cRef;
    };

    class VtableAtEnd : Data, public IUnknown {
    public:
     STDMETHODIMP QueryInterface(REFIID riid, void **ppvOut)
     {
      if (riid == IID_IUnknown) {
       *ppvOut = static_cast<IUnknown*>(this);
       AddRef(); // a successful QueryInterface returns a counted reference
       return S_OK;
      }
      *ppvOut = NULL;
      return E_NOINTERFACE;
     }
     STDMETHODIMP_(ULONG) AddRef() { return InterlockedIncrement(&m_cRef); }
     STDMETHODIMP_(ULONG) Release()
     {
      LONG cRef = InterlockedDecrement(&m_cRef);
      if (!cRef) delete this;
      return cRef;
     }
    };

    The layout of this object may very well be as follows:

             Data.vtbl      ->  ~Data
             m_cRef
        p -> IUnknown.vtbl  ->  QueryInterface
                                AddRef
                                Release

    Observe that in this particular object layout, the vtable resides at the end of the object rather than at the beginning. This is perfectly legitimate behavior. Although it is the most common object layout to put the vtable at the beginning, COM imposes no requirement that it be done that way. If you want to put your vtable at the end and use negative offsets to access your object's members, then more power to you.
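
    The pointer adjustment can be observed in a portable sketch. The assumptions are loud ones: IThing is a hypothetical two-method interface standing in for IUnknown, plain long arithmetic replaces the Interlocked functions, and InterfaceOffset and ObjectFromInterface are illustrative helpers. The actual offset is whatever the compiler's object layout makes it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical interface standing in for IUnknown so the sketch builds anywhere.
struct IThing {
    virtual long AddRef() = 0;
    virtual long Release() = 0;
protected:
    ~IThing() = default;   // objects are destroyed through Release, not through IThing
};

class Data {
public:
    Data() : m_cRef(1) { }
    virtual ~Data() { }
    long m_cRef;
};

// Same shape as VtableAtEnd in the text: the data base first, the interface second.
class VtableAtEnd : Data, public IThing {
public:
    // Not thread-safe; the original uses InterlockedIncrement/InterlockedDecrement.
    long AddRef() override { return ++m_cRef; }
    long Release() override {
        long cRef = --m_cRef;
        if (!cRef) delete this;
        return cRef;
    }
};

// Byte distance from the start of the object to the interface pointer.
// The value depends on the compiler's layout; it need not be zero.
std::ptrdiff_t InterfaceOffset(VtableAtEnd* obj)
{
    IThing* p = obj;   // the conversion moves the address to the IThing subobject
    return reinterpret_cast<char*>(p) - reinterpret_cast<char*>(obj);
}

// The reverse trip: from the interface pointer back to the underlying object.
// The compiler applies the same offset with the opposite sign.
VtableAtEnd* ObjectFromInterface(IThing* p)
{
    assert(p != nullptr);
    return static_cast<VtableAtEnd*>(p);
}
```

    Calling AddRef or Release through the IThing pointer still reaches m_cRef, which sits in front of the interface's vtable pointer on common compilers. That is the negative-offset access in action, with the compiler doing the arithmetic.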

  • The Old New Thing

    How air conditioning revolutionized competitive bicycling


    I'm not really interested in sports. Teams, standings, scores, who got traded to what team, none of that is interesting to me. What I am interested in, however, is "meta-sports": The business of sports, the technology of sports, the evolution of techniques, changes in the rules, that sort of thing. That's one of the reasons I'm a fan of the radio program Only a Game. (The other, more important, reason can be summed up in two words: Charlie Pierce.)

    All that is a rather lengthy lead-in to Transition Game, Nick Schulz's look at the world behind sports. He covers what it is about sports that I like, with none of the stuff I don't like. (I've linked to him before, but I like him so much I'm going to do it again.) You too can learn how air conditioning revolutionized competitive bicycling. Or you can learn about the use of robots as camel jockeys in Qatar. Here's a picture. It's like an episode of Futurama come to life.

  • The Old New Thing

    The cost of trying too hard: String searching


    There are many algorithms for fast string searching, but the running time of a string search is inherently O(n), where n is the length of the string being searched: If m is the length of the string being searched for (which I will call the "target string"), then any algorithm that accesses fewer than n/m elements of the string being searched will have a gap of m unaccessed elements, which is enough room to hide the target string.

    More advanced string searching algorithms can take advantage of characteristics of the target string, but in the general case, where the target string is of moderate size and is not pathological, all that the fancy search algorithms give you over the naive search algorithm is a somewhat smaller multiplicative constant.

    In the overwhelming majority of cases, then, a naive search algorithm is adequate, as long as you're searching for normal strings and not edge cases like "Find aaaaaaaaaaaaaaab in the string aaaaaaaaaaaaaabaaaaaaaaaaaaaaab". If you have a self-similar target string, the running time of a naive search is O(mn) where m is the length of the target string. The effort in the advanced searching algorithms goes towards diminishing the effect of m, but they pay for it by requiring preliminary analysis of the target string. If your searches are for "relatively short" "normal" target strings, then the benefit of this analysis doesn't merit the cost.

    That's why nearly all library functions that do string searching use the naive algorithm. The naive algorithm is the correct algorithm over 99% of the time.
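
    For reference, here is a minimal sketch of the naive algorithm. NaiveFind and CheckAgainstStdFind are illustrative names, and this is not a claim about how any particular library implements its search.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns the index of the first occurrence of target in text,
// or std::string::npos if there is none. No preprocessing of target.
std::size_t NaiveFind(const std::string& text, const std::string& target)
{
    const std::size_t n = text.size();
    const std::size_t m = target.size();
    for (std::size_t i = 0; i + m <= n; ++i) {
        // Compare target against the window that starts at position i.
        std::size_t j = 0;
        while (j < m && text[i + j] == target[j]) {
            ++j;
        }
        if (j == m) {
            return i;   // every character matched
        }
        // Mismatch: slide the window one position and start over.
        // A self-similar target makes the inner loop run long each time,
        // which is where the O(mn) worst case comes from.
    }
    return std::string::npos;
}

// Sanity check: the sketch should agree with the standard library's search.
void CheckAgainstStdFind(const std::string& text, const std::string& target)
{
    assert(NaiveFind(text, target) == text.find(target));
}
```

    For ordinary targets the inner loop usually bails out after a character or two, so the work stays close to one comparison per position of the string being searched.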

  • The Old New Thing

    From Doom to Gloom: The story of a video game


    NPR's Morning Edition developed a series on the subject of flops, and one of their segments was devoted to the rise and fall of John Romero. You can read more about the phenomenon known as Daikatana in a huge series on Gamespot. Set aside at least an hour if you choose to read it. You can also read the Dallas Observer story that opened the floodgates.

  • The Old New Thing

    The cost of trying too hard: Splay trees


    Often, it doesn't pay off to be too clever. Back in the 1980's, I'm told the file system group was working out what in-memory data structures to use to represent the contents of a directory so that looking up a file by name was fast. One of the experiments they tried was the splay tree. Splay trees were developed in 1985 by Sleator and Tarjan as a form of self-rebalancing tree that provides O(log n) amortized cost for locating an item in the tree, where n is the number of items in the tree. (Amortized costing means roughly that the cost of M operations is O(M log n). The cost of an individual operation is O(log n) on average, but an individual operation can be very expensive as long as it's made up for by previous operations that came in "under budget".)

    If you're familiar with splay trees you may already see what's about to happen.

    A very common operation in a directory is enumerating and opening every file in it, say, because you're performing a content search through all the files in the directory or because you're building a preview window. Unfortunately, when you sequentially access all the elements in a splay tree in order, this leaves the tree totally unbalanced. If you enumerate all the files in the directory and open each one, the result is a linear linked list sorted in reverse order. Locating the first file in the directory becomes an O(n) operation.
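
    The degenerate shape can be reproduced with a small bottom-up splay tree. This is a sketch under stated assumptions: integer keys stand in for file names, keys are distinct, and SplayTree and BuildAndEnumerate are illustrative names with no connection to the file system code in the story.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

struct Node {
    int key;
    Node *left = nullptr, *right = nullptr, *parent = nullptr;
    explicit Node(int k) : key(k) {}
};

class SplayTree {
public:
    // Ordinary binary-search-tree insert (keys assumed distinct),
    // then splay the new node to the root.
    void Insert(int key) {
        Node* parent = nullptr;
        Node* cur = root_;
        while (cur) {
            parent = cur;
            cur = key < cur->key ? cur->left : cur->right;
        }
        nodes_.push_back(std::make_unique<Node>(key));
        Node* n = nodes_.back().get();
        n->parent = parent;
        if (!parent) root_ = n;
        else if (key < parent->key) parent->left = n;
        else parent->right = n;
        Splay(n);
    }

    // A successful lookup splays the node it found to the root.
    bool Find(int key) {
        Node* cur = root_;
        while (cur && cur->key != key) cur = key < cur->key ? cur->left : cur->right;
        if (!cur) return false;
        Splay(cur);
        return true;
    }

    // Number of nodes on the longest path from the root down to a leaf.
    int Height() const { return HeightOf(root_); }

private:
    static int HeightOf(const Node* n) {
        return n ? 1 + std::max(HeightOf(n->left), HeightOf(n->right)) : 0;
    }

    // Rotate x above its parent, keeping the search-tree ordering.
    void Rotate(Node* x) {
        Node* p = x->parent;
        assert(p != nullptr);
        Node* g = p->parent;
        if (p->left == x) {
            p->left = x->right;
            if (x->right) x->right->parent = p;
            x->right = p;
        } else {
            p->right = x->left;
            if (x->left) x->left->parent = p;
            x->left = p;
        }
        p->parent = x;
        x->parent = g;
        if (!g) root_ = x;
        else if (g->left == p) g->left = x;
        else g->right = x;
    }

    // Bottom-up splay: zig-zig rotates the parent first, zig-zag rotates x twice,
    // and a final single rotation (zig) handles a node just below the root.
    void Splay(Node* x) {
        while (x->parent) {
            Node* p = x->parent;
            Node* g = p->parent;
            if (g) {
                bool zigzig = (g->left == p) == (p->left == x);
                Rotate(zigzig ? p : x);
            }
            Rotate(x);
        }
    }

    Node* root_ = nullptr;
    std::vector<std::unique_ptr<Node>> nodes_;   // owns every node
};

// Insert keys 0..n-1 out of order (evens, then odds), then look up every key
// in ascending order, the way a directory enumeration would.
SplayTree BuildAndEnumerate(int n) {
    SplayTree tree;
    for (int start = 0; start < 2; ++start)
        for (int k = start; k < n; k += 2) tree.Insert(k);
    for (int k = 0; k < n; ++k) tree.Find(k);
    return tree;
}
```

    After BuildAndEnumerate(n), Height() equals n: each in-order lookup hangs the previous root off the left of the new one, so the tree ends up as a single chain with the largest key at the root and the smallest key at the bottom.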

    From a purely algorithmic analysis point of view, the O(n) behavior of that file open operation is not a point of concern. After all, in order to get to this point, you had to perform n operations to begin with, so that very expensive operation was already "paid for" by the large number of earlier operations. However, in practice, people don't like it when the cost of an operation varies so widely from use to use. If you arrive at a client's office five minutes early for a month and then show up 90 minutes late one day, your client will probably not be very impressed by your explanation of "Well, I was early so many times, I'm actually still ahead of schedule according to amortized costing."

    The moral of the story: Sometimes trying too hard doesn't work.

    (Postscript: Yes, there have been recent research results that soften the worst-case single-operation whammy of splay trees, but these results weren't available in the 1980's. Also, remember that consistency in access time is important.)

  • The Old New Thing

    ReadProcessMemory is not a preferred IPC mechanism


    Occasionally I see someone trying to use the ReadProcessMemory function as an inter-process communication mechanism. This is ill-advised for several reasons.

    First, you cannot use ReadProcessMemory across security contexts, at least not without doing some extra work. If somebody uses "runas" to run your program under a different identity, your two processes will not be able to use ReadProcessMemory to transfer data back and forth.

    You could go to the extra work to get ReadProcessMemory working by adjusting the privileges on your process to grant PROCESS_VM_READ permission to the owner of the process you are communicating with, but this throws the doors wide open. Any process running with that identity can read the data you wanted to share, not just the process you are communicating with. If you are communicating with a process of lower privilege, you just exposed your data to lower-privilege processes other than the one you are interested in.

    What's more, once you grant PROCESS_VM_READ permission, you grant it to your entire process. Not only can that process read the data you're trying to share, it can read anything else that is mapped into your address space. It can read all your global variables, it can read your heap, it can read variables out of your stack. It can even corrupt your stack!

    What? Granting read access can corrupt your stack?

    If a process grows its stack into the stack guard page, the unhandled exception filter catches the guard exception and extends the stack. But when it happens inside a private "catch all exceptions" handler, such as the one that the IsBadReadPtr function uses, it is handled privately and doesn't reach the unhandled exception filter. As a result, the stack is not grown; a new stack guard page is not created. The same goes for a guard page that is touched from outside the process by a ReadProcessMemory call: the guard page is used up without the stack being extended. When the stack later grows to and then past the point of the prematurely-committed guard page, what would normally be a stack guard exception is now an access violation, resulting in the death of the thread and with it likely the process.

    You might think you could catch the stack access violation and try to shut down the thread cleanly, but that is not possible for multiple reasons. First, structured exception handling executes on the stack of the thread that encountered the exception. If that thread has a corrupted stack, it becomes impossible to dispatch that exception since the stack that the exception filters want to run on is no longer viable.

    Even if you could somehow run these exception filters on some sort of "emergency stack", you still can't fix the problem. At the point of the exception, the thread could be in the middle of anything. Maybe it was inside the heap manager with the heap lock held and with heap data structures in a state of flux. In order for the process to stay alive, the heap data structures need to be made consistent and the heap lock released. But you don't know how to do that.

    There are plenty of other inter-process communication mechanisms available to you. One of them is anonymous shared memory, which I discussed a few years ago. Anonymous shared memory still has the problem that any process running under the same token as the one you are communicating with can read the shared memory block, but at least the scope of the exposure is limited to the data you explicitly wanted to share.

    (In a sense, you can't do any better than that. The process you are communicating with can do anything it wants with the data once it gets it from you. Even if you somehow arranged so that only the destination process can access the memory, there's nothing stopping that destination process from copying it somewhere outside your shared memory block, at which point your data can be read from the destination process by anybody running with the same token anyway.)

  • The Old New Thing

    At least there's a funny side to spam


    Poorly-drawn cartoons inspired by actual spam subject lines!

    It's pretty much what the title says. Don't forget to read the fan mail.

    Sometimes it's even funny.

  • The Old New Thing

    Understanding what things mean in context: Dispatch interfaces


    Remember that you have to understand what things mean in context. For example, the IActiveMovie3 interface has a method called get_MediaPlayer. If you come into this method without any context, you might expect it to return a pointer to an IMediaPlayer interface, yet the header file says that it returns a pointer to an IDispatch interface instead. If you look at the bigger picture, you'll see why this makes sense.

    IActiveMovie3 is an IDispatch interface. As you well know, the IDispatch interface's target audience is scripting languages, primarily classic Visual Basic (and to a lesser degree, JScript). Classic Visual Basic is a dynamically-typed language, wherein nearly all variables are merely "objects", the precise type of which is not known until run-time. A statically-typed language will complain at compile time that you are invoking a method on an object that doesn't support that method or that you are passing the wrong number or type of operands to a method. A dynamically-typed language, on the other hand, doesn't check until the line of code is actually executed whether the method exists, and if it does, whether you called it correctly.

    When working with IDispatch and dynamically-typed languages, therefore, the natural unit of currency for objects is the IDispatch. All objects take the form of IDispatch. Objects that produce other objects will produce IDispatch interfaces, because that's what the scripting engine is expecting.

    That's why the get_MediaPlayer method returns an IDispatch. Because that's what the scripting engine expects. And, if you are familiar with the context, it's also what you should expect.

    A tell-tale sign of this context comes from the name "get_MediaPlayer". This name does not follow the COM function naming convention but rather is a constructed name for the C/C++ binding of the "get" property. C/C++ bindings are the assembly language of OLE automation: You're operating with the nuts and bolts of OLE automation, and if you want to play at this level, you're going to have to know how to use a screwdriver.

  • The Old New Thing

    France, she is, how you say, on sale!


    Marketplace reports on the start of the winter sale season in France. By law, retailers are permitted sales only twice a year, so the onset of sale season generates quite a bit of shopping madness. There is also a proposal to allow more sale periods, but opponents argue that doing so would harm smaller businesses. Coming from the land of sale fatigue (we just emerged from the after-Christmas sale season and are entering the Winter White Sale season, after which comes the President's Day season...), I find a certain appeal to the idea of limiting how often things can "go on sale". Who can forget the oriental rug stores that are perpetually going out of business? It's become such a joke that The New York Times flatly refuses to run ads for "Going Out of Business" sales at oriental rug stores.
