Larry Osterman's WebLog

Confessions of an Old Fogey
XP and Systems Programming

In Raymond's blog post today, he mentioned that if you don't want GetQueuedCompletionStatus to return when a handle is set to the signaled state, you can set the bottom bit of an event handle, which suppresses the notification.

The very first comment (from "Aaargh!") was that that was ugly.  And he's right.  Aaargh! suggested that instead a new "suppressCompletionNotifications" parameter be added to the functions that could use this mechanism that would achieve the same goal.

And that got me to thinking about XP and systems programming (XP as in eXtreme Programming, not as in Windows XP) in general.

One of the core tenets of XP is refactoring - whenever you discover that your design isn't working, or if you discover an opportunity for code sharing, refactor the code to achieve that.

So how does this work in practice when doing systems programming?

I'd imagine that the dialog goes something like this:

Program Manager: "Hmm.  We need to add a more scalable queue management solution because we're seeing lots of lock convoy issues in our OS."

Architect: "Let me think about it...  We can do that - we'll add a new kernel synchronization structure that maintains the queue in the kernel, and add an API that returns when an item is put onto the queue.  We'll then let that kernel queue be associated with file handles, so that when I/O completes on the file, the "wait for the item" API returns.  The really cool thing about this idea is that we just need to add a couple of new APIs, and we can hide all the work involved in the kernel so that no application needs to be modified unless it wants to use this new feature."

Program Manager: "Sounds great!  Go for it."

<Time Passes...  The feature gets designed and implemented>

Tester, to Developer: "Hmm.  I was testing this new completion port mechanism you guys added.  I associated a completion port to a serial device, and I noticed that when I associated my file handle, my completion port was being signaled for every one of my calls.  That's really annoying.  I only want it to be signaled when a ReadFile or WriteFile call completes, I don't want it to be called when a call to DeviceIoControl completes, since I'm making the calls to DeviceIoControl out-of-band.  We need a mechanism to fix this."

At this point, we have an interesting issue that shows up. Let's consider what happens when you apply XP as a solution...

Developer, to Architect, sometime later: "Ya know, Tester's got a point.  This is clearly a case that we missed in our design, we need to fix this.  This is clearly an opportunity for refactoring, so we'll simply add a new "suppressCompletionNotifications" parameter to all the APIs that can cause I/O completions to be signalled."

Architect: "Yup, you're right".

Developer goes out and adds the new suppressCompletionNotifications parameter to all the APIs involved.  He changes the API signature for 15 or so different APIs, fixes the build breaks that this caused, rebuilds the system and hands it to the test team.

Tester: "Wait a second.  None of my test applications work any more - you changed the function signature of WriteFile, and now the existing compiler can't write data to the disk!"

Ok, that was a stupid resolution, and no developer in their right mind would do that, because they know that adding a new parameter to WriteFile would break applications.  But XP says that you refactor when stuff like this happens.  Ok, so maybe you don't refactor the existing APIs.  What about adding new versions of the APIs?  Let's rewind the tape a bit and try again...

Developer goes out and adds a new variant of each of the APIs involved, with a new "suppressCompletionNotifications" parameter.  In fact, he's even more clever - he adds a "flags" parameter to the API and defines "suppressCompletionNotifications" as one of the flags (thus future-proofing his change).  He adds 15 or so different APIs, and then he runs into WriteFileEx.  That's a version of WriteFile that adds a completion routine.  Crud.  Now he needs FOUR different variants of WriteFile - two that have the new flag, and two that don't.  But since refactoring is the way to go, he presses on, builds the system and hands it to the tester.

Tester: "Hey, there are FOUR DIFFERENT APIs to write data to a file.  Talk about API bloat, how on earth am I supposed to be able to know which of the four different APIs to call?  Why can't you operating system developers just have one simple way of writing a byte to the disk?"

Tester (muttering under his breath): "Idiots".
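
The four-way split can be sketched in C. This is only a model of the shape of the problem: every name below (write_file, write_file_ex, write_file_flags, write_file_ex_flags, WRITE_FLAG_SUPPRESS_COMPLETION, write_core, record_bytes) is hypothetical, the signatures are deliberately much simpler than the real Win32 ones, and no actual I/O is performed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag; the "flags" parameter leaves room for future bits. */
#define WRITE_FLAG_SUPPRESS_COMPLETION 0x1u

/* Hypothetical completion routine, in the spirit of WriteFileEx. */
typedef void (*write_done_fn)(size_t bytes_written, void *context);

typedef struct {
    size_t bytes_written;      /* how many bytes the model "wrote" */
    int    completion_queued;  /* 1 if a completion packet would be queued */
} write_result;

/* The single internal routine that all four public entry points share. */
write_result write_core(const void *buf, size_t len, uint32_t flags,
                        write_done_fn done, void *context)
{
    write_result r;
    assert(buf != NULL || len == 0);
    r.bytes_written = len;  /* pretend the whole buffer was written */
    r.completion_queued = (flags & WRITE_FLAG_SUPPRESS_COMPLETION) == 0;
    if (done != NULL)
        done(len, context);
    return r;
}

/* 1. The original API. */
write_result write_file(const void *buf, size_t len)
{
    return write_core(buf, len, 0, NULL, NULL);
}

/* 2. The original API plus a completion routine. */
write_result write_file_ex(const void *buf, size_t len,
                           write_done_fn done, void *context)
{
    return write_core(buf, len, 0, done, context);
}

/* 3. New: the original API plus flags. */
write_result write_file_flags(const void *buf, size_t len, uint32_t flags)
{
    return write_core(buf, len, flags, NULL, NULL);
}

/* 4. New: completion routine plus flags. */
write_result write_file_ex_flags(const void *buf, size_t len, uint32_t flags,
                                 write_done_fn done, void *context)
{
    return write_core(buf, len, flags, done, context);
}

/* Example completion routine: adds the byte count to a size_t total. */
void record_bytes(size_t bytes_written, void *context)
{
    *(size_t *)context += bytes_written;
}
```

The implementation cost is small, since everything forwards to one routine. The cost lands on the caller, who now has four public names to choose from, and on the test team, which has four entry points to cover.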

Now let's rewind back to the starting point and reconsider the original problem.

Developer, to Architect, sometime later: "Ya know, Tester's got a point.  This is clearly a case that we missed in our design, we need to fix this.  I wonder if there's some way that we could encode this desired behavior without changing any of our API signatures?"

Architect: "Hmm.  Due to the internal design of our handle manager, the low two bits of a handle are never set.  I wonder if we could somehow leverage these bits and encode the fact that you don't want the completion port to be fired in one of those bits..."

Developer: "Hmm. That could work, let's try it."

And that's how design decisions like this one get made - the alternative to exploiting the low bit of a handle is worse than exploiting the bit.
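
The bit-stealing idea from the dialog can be sketched in portable C. This is a model under stated assumptions, not the Windows implementation: handle_t, HANDLE_TAG_MASK, HANDLE_SUPPRESS_BIT and the three helper functions are hypothetical names, and the only property assumed of the handle manager is the one Architect states, namely that the low two bits of a real handle are never set.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an OS handle. The assumed handle manager only
   hands out multiples of 4, so bits 0 and 1 of a real handle are zero. */
typedef uintptr_t handle_t;

#define HANDLE_TAG_MASK     ((handle_t)0x3)  /* the two always-zero low bits */
#define HANDLE_SUPPRESS_BIT ((handle_t)0x1)  /* hypothetical "do not notify" tag */

/* Caller side: tag a handle to ask that no completion be queued. */
handle_t handle_suppress_notification(handle_t h)
{
    assert((h & HANDLE_TAG_MASK) == 0);  /* must be an untagged handle */
    return h | HANDLE_SUPPRESS_BIT;
}

/* System side: recover the real handle by clearing the tag bits. */
handle_t handle_strip_tags(handle_t h)
{
    return h & ~HANDLE_TAG_MASK;
}

/* System side: returns 1 if a completion should be queued, 0 if suppressed. */
int handle_wants_notification(handle_t h)
{
    return (h & HANDLE_SUPPRESS_BIT) == 0;
}
```

No function signature changes here: an existing caller that passes an untagged handle gets exactly the old behavior, and only a caller that opts in by setting the bit sees anything different.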

And it also points out another issue with XP: Refactoring isn't compatible with public interfaces - once you've shipped, your interfaces are immutable.  If you decide you need to refactor, you need a new interface, and you must continue to support your existing interfaces, otherwise you break clients.

And when you're the OS, you can't afford to break clients.

Refactoring can be good as an internal discipline, but once you've shipped, your interfaces are frozen.

  • Well, it can be argued that the public interfaces should be designed such that future mutations can be easier to implement and use. One way to expose functionality is to offer COM interfaces wherever possible and approximations of the COM model with versioning support elsewhere.

    One problem with Win32 APIs has been that there's no consistency in the API design. As someone pointed out, there are probably islands of consistency in the best case. Some APIs are downright horrible, e.g. Shell APIs (SHGetPathFromIDList etc.) that take string arguments and assume buffer size.

    Given the maturity Microsoft has shown in developing various development processes, I wonder why there isn't a Microsoft standard guideline for exposing a public API?

  • Amit, good points.

    Actually, the CLR/C# with its built-in concepts of optional parameters goes a long way towards solving (or at least ameliorating) this issue - at a minimum, it solves the "Four versions of WriteFile" problem.

    And there are now Microsoft standard guidelines for exposing public APIs - for managed code.
  • Well, you don't *have* to freeze your interfaces...but your clients just can't upgrade as freely as they'd like to. In some cases it may be impossible to avoid changing/invalidating your APIs (MyObject.EjectFloppyDisk() comes to mind in a few years...), although more flexibly-written APIs can help alleviate that (FloppyDiskObject.Eject() or MyObject.Eject(DiskTypeID)). Which gets into what Amit was talking about with API guidelines. Could make a good article...
  • It doesn't help that Windows has been in development for, what, almost 20 years now? Naturally there are vast inconsistencies in the APIs; the "best practices" for API design have changed numerous times over that period. That's due to increased experience in the programming industry, and the radical changes in processing power. Using COM for everything on a 12MHz processor with 4MB of RAM would have been crazy.
  • Wasn't the whole point of passing structures to API calls to handle this? Is there an APIEvolunologist who can map API design to time/team/person?
  • David,
    Yes, the data structures were done to add some level of API portability, but for a huge number of APIs, it's a massive inconvenience - there's a fine trade-off between passing a structure and passing the contents of the structure in multiple parameters.

  • I liked the concept of structures. But they are a major hassle, especially the SizeOf field. I use calc.exe to calculate them while I desperately try to remember binary sizes of data types. (I have no idea what the value of true is; in MS products I worked with, any negative, any positive, -1, +1, or 0 have been true [or false on a different product]).

    E.g. a VB variant bool is 4 bytes (2 states), but a byte is 1 byte (256 states) and who knows as a variant. I guess this is to align.

    Still, without having a clue what exactly variants are, I gave up on real data types, except for API calls, years ago and haven't had any problems. But it removes the conceptual model of data being loaded into registers and acted on by an ALU. While I only have experience of ALUs as discrete components (and 4 bit), I just scale it up. My job as an apprentice telecommunication technician when I left school coincided with the birth of the 4004 - but this was really high tech stuff and we were lucky to get 4 flip flops on 1 IC.

    At least telephone exchanges worked in base 10, and NOT BCD base 10 either. IIRC base 50 for finding a free line.
  • A nice way to solve this would have been to add a dwFlags member to the OVERLAPPED structure. However, OVERLAPPED doesn't have a cbSize member for version control, so that option is also out.
  • I can't see how refactoring relates to this.
    From the post:
    "One of the core tenets of XP is refactoring - whenever you discover that your design isn't working, or if you discover an opportunity for code sharing, refactor the code to achieve that."
    "Refactoring isn't compatible with public interfaces - once you've shipped, your interfaces are immutable."

    This is false. Refactoring is about modifying working code, not to change its behavior but to make it easier to read, understand and modify. So you don't change interfaces when you refactor, you don't even modify the interface behavior. After refactoring a piece of code, you get the very same results you got before the refactoring; the only visible changes are in the code.
  • First of all, most XP practitioners would argue that what you are talking about here is not refactoring because you are changing the testable interface. This is redesign. Which is OK, but as you point out quite clearly, doing so when you have a published interface that people are already consuming is problematic.

    XP basically tells the developers to do nothing that doesn't add value to the customers. If the customer for an API is programmers, then breaking their API would be something that removed value and should be avoided.

    If the dialogue you present took place during the lifecycle of a _single release_, and not across releases, then I would argue that the developers and architects are free to make the kind of change you show. But, if the customer demands backward compatibility as a guiding principle (and ours do, right?), then making such a change would be wrong.
  • Mmm... But still you have not fixed the "four versions of write", because from the tester's point of view, all four versions that actually exist still have to be tested. In fact, hiding the "bits inside the handles" may lead the tester to forget some cases that are hidden by the change.

    Best regards,
    diego
  • Diego, you're right, the test team needs to test the new functionality. That's always the case. But you don't need to write an entire new test suite for the brand new API, instead you add a couple of test cases for the existing API. It's an order of magnitude less work.
  • If you have to change your API use the deprecation tag before you do.
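
The size-field versioning that several comments mention (the SizeOf field, and the cbSize member that OVERLAPPED lacks) can be sketched in C. This is a minimal illustration, assuming a made-up request structure: io_request_v1, io_request_v2, IO_FLAG_SUPPRESS_COMPLETION and io_request_get_flags are hypothetical names and are not part of any real API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IO_FLAG_SUPPRESS_COMPLETION 0x1u  /* hypothetical flag */

/* Version 1 of a hypothetical request structure. */
typedef struct {
    uint32_t cbSize;  /* caller sets this to sizeof(its own struct version) */
    uint64_t offset;
} io_request_v1;

/* Version 2 appends a flags member; the v1 members keep their layout. */
typedef struct {
    uint32_t cbSize;
    uint64_t offset;
    uint32_t dwFlags;  /* new in version 2 */
} io_request_v2;

/* Reads the flags from a request of either version. Members that lie beyond
   the caller's cbSize are treated as absent and default to 0.
   Returns 0 on success, -1 if cbSize is too small to be any known version. */
int io_request_get_flags(const void *request, uint32_t *flags_out)
{
    uint32_t cb;

    assert(request != NULL && flags_out != NULL);
    memcpy(&cb, request, sizeof cb);  /* cbSize is always the first member */
    if (cb < sizeof(io_request_v1))
        return -1;

    *flags_out = 0;  /* default for callers built against version 1 */
    if (cb >= sizeof(io_request_v2)) {
        io_request_v2 full;
        memcpy(&full, request, sizeof full);
        *flags_out = full.dwFlags;
    }
    return 0;
}
```

A caller compiled against version 1 keeps working unchanged, while a newer caller opts in by passing the larger structure. The catch raised in the comments still applies: this only works if the size member was there from the first release.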