Larry Osterman's WebLog

Confessions of an Old Fogey

Larry's Rules of software engineering, Part 4 - Writing servers is easy, writing clients is HARD.



Over the past 20 years or so, I've written both (I wrote the first NT networking client and I wrote the IMAP and POP3 servers for Microsoft Exchange), so I think I can state this with some authority.  I want to be clear - it's NOT easy to write a server - especially a high performance server.  But it's a heck of a lot easier to write a server than it is to write a client.

 

Way back when, when I joined the NT project (back in 1989ish), my job was to write the network file system (redirector) for NT 3.1.

Before that work item was assigned to me, it was originally on the plate of one of the senior developers on the team.  The server was assigned to another senior developer.

When I first looked at the schedules, I was surprised.  The development schedule for both the server AND the client was estimated to be about 6 months of work.

Now I've got the utmost respect for the senior developers involved.  I truly do.  And the schedule for the server was probably pretty close to being correct.

But the client numbers were off.  Way off.  Not quite an order of magnitude off, but close.

You see, the senior developer who had done the scheduling had (IMHO) forgotten one of the cardinal rules of software engineering:

Writing servers is easy, writing clients is hard.

If you think about it for a while, it actually makes sense.  When you're writing a server, the work involved is just to ensure that you implement the semantics in the specification - that you issue correct responses for the correct inputs.

But when you write a client, you need to interoperate with a whole host of servers.  Each of which was implemented to ensure that it implements the semantics in the specification.

But the thing is, the vast majority of protocol specifications out there don't fully describe the semantics of the protocol.  There are almost always implementation specifics that leak through the protocol abstraction.  And that's what makes the life of a client author so much fun. 

These leaks can be things like the UW IMAP server not allowing more than one connection to SELECT a mailbox at a time when the mailbox was in the MBOX format.  This is a totally reasonable architectural restriction (the MBOX file format doesn't allow the server to support multiple clients simultaneously connecting to the mailbox), and the IMAP protocol is silent on this (this is not quite true: there are several follow-on RFCs that clarify this behavior).  So when you're dealing with such a server, you need to be careful to only ever use a single TCP connection (or to ensure that you never SELECT the same mailbox on more than one TCP connection).
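A client can enforce that discipline with a small amount of bookkeeping.  Here's a minimal sketch (the class and method names are my own invention, not from any real IMAP library) that tracks which connection currently has each mailbox SELECTed and refuses a second SELECT from a different connection:

```python
class ImapConnectionPool:
    """Toy bookkeeping: ensure no two connections SELECT the same
    mailbox at the same time (the restriction some MBOX-backed
    servers impose)."""

    def __init__(self):
        self._selected = {}  # mailbox name -> connection id

    def select(self, conn_id, mailbox):
        owner = self._selected.get(mailbox)
        if owner is not None and owner != conn_id:
            # Another connection already has this mailbox SELECTed.
            raise RuntimeError(
                f"mailbox {mailbox!r} already SELECTed on connection {owner}")
        self._selected[mailbox] = conn_id

    def close(self, conn_id, mailbox):
        # CLOSE (or a new SELECT) releases the mailbox for other connections.
        if self._selected.get(mailbox) == conn_id:
            del self._selected[mailbox]
```

A real client would wrap its actual IMAP connections with something like this, issuing the protocol SELECT only after the local check passes.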

They can be more subtle.  For example, the base HTML specification doesn't really allow for accurate placement of elements.  But web site authors often really want to be able to exactly place their visual elements.  Some authors figured out that if they inserted certain elements in a particular order, they could get their web sites laid out in the form they wanted.  Unfortunately, they were depending on ambiguity in the HTML protocol (and yes, HTML is a protocol).  That ambiguity was implemented in one way by one particular browser.

But every other browser had to deal with that ambiguity in the same way as the first browser if it wanted to render the web site properly.  It's all well and good to say to the web site author "Fix your darned code", but the reality is that it doesn't work.  The web site author might not give a hoot about whether the site looks good in your browser; as long as it looks good in the browser that's listed on the site, they're happy campers.

The server (in this case the web site author) simply pushes the problem onto the client.  It's easier - if the client wants to render the site correctly, they need to be ambiguity-for-ambiguity compatible with the existing browser.

Ambiguity is a huge part of what makes writing clients so much fun.  In fact, I'm willing to bet that every single client for every single network protocol implemented by more than one vendor has had to make compromises in design forced by ambiguities in the design of the protocol (this may not be true for protocols like DCE RPC, where the specification is so carefully specified, but it's certainly true for most other protocols).  Even a well specified protocol like IMAP has had 114 clarifications made to the protocol between RFC 2060 and RFC 3501 (the two most recent versions of the protocol).  Not all the clarifications were to resolve ambiguities (some resolved spelling errors and typos), but the majority of them were to deal with ambiguities.

Clients also have to deal with multiple versions of a protocol.  For CIFS clients, the client needs to be able to understand how to talk to at least 7 different versions of the protocol, and they need to be able to implement their host OS semantics on every one of those versions.  For the original NT 3.1 redirector, more than 3/4ths of the specification for the redirector was taken up with how each and every single Win32 API would be implemented against various versions of the server.  And each and every one of those needed specific code paths (and test cases) in the client.  For the server, each of the protocol dialects was essentially the same - you needed to know how to implement the semantics of the protocol on the server's OS.
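The multi-version problem starts with a negotiation step: the client offers the dialects it speaks, the server picks one, and from then on the client has to route every operation through dialect-specific code paths.  Here's a minimal sketch of that first step; the dialect list and the `negotiate` helper are simplified illustrations, not the actual CIFS negotiation machinery:

```python
# Simplified dialect table, oldest to newest.  Real CIFS negotiation
# exchanges dialect strings like "PC NETWORK PROGRAM 1.0" and
# "NT LM 0.12"; this list is just for illustration.
DIALECTS = ["CORE", "LANMAN1.0", "LANMAN2.1", "NT LM 0.12"]


def negotiate(client_dialects, server_dialects):
    """Pick the newest dialect both sides understand.

    The newest common dialect wins because newer dialects generally
    offer richer (or faster) protocol elements.
    """
    common = set(client_dialects) & set(server_dialects)
    if not common:
        raise ValueError("no common dialect")
    return max(common, key=DIALECTS.index)
```

Negotiation itself is the easy part; the hard part, as described above, is that every Win32 API the redirector exposes then needs a code path (and test cases) for each dialect the negotiation can produce.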

For the client, on the other hand, you had to pick and choose which of the protocol elements was most appropriate given the circumstances.  As a simple example, for the IMAP protocol, clients have two different access mechanisms - you can access the messages in a mailbox by UID or by sequence number.  UIDs have some interesting semantics (especially if the client's going to access the mailbox offline), but sequence numbers have different semantics.  The design of the client heavily depends on this choice - there are things you can't do if you use UIDs but there's a different set of things you can't do if you use sequence numbers.  It's a really tough design decision that will quite literally reflect the quality of your client - is your IMAP client nothing more than a POP3 client on steroids, or does it fully take advantage of the protocol?  Another decision made by clients: Do they fetch the full RFC 2822 header from the server and parse it on the client, or do they fetch only the elements of the header that they're going to display?
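To see why the UID-versus-sequence-number choice matters so much, consider what happens when a message is expunged.  This toy model (not real IMAP code - a real mailbox tracks far more state) shows sequence numbers being renumbered while UIDs stay stable:

```python
class Mailbox:
    """Toy mailbox illustrating the two IMAP addressing schemes:
    sequence numbers are positions (renumbered after an EXPUNGE),
    UIDs are stable identifiers for individual messages."""

    def __init__(self, uids):
        self.messages = list(uids)  # position i holds the message with this UID

    def uid_of(self, seq):
        return self.messages[seq - 1]  # sequence numbers are 1-based

    def expunge(self, seq):
        # Removing a message shifts every later message down one position.
        del self.messages[seq - 1]


mbox = Mailbox([101, 102, 103])
assert mbox.uid_of(2) == 102
mbox.expunge(1)                # delete message 1; everything renumbers
assert mbox.uid_of(1) == 102   # sequence number 1 now names a different message
# ...but UID 102 still identifies the same message it always did.
```

A client that caches messages by sequence number has to reconcile its cache every time the server reports an EXPUNGE; a client that caches by UID doesn't, which is why UIDs matter so much for offline access.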

So when you're thinking about writing networking software, just remember the rule:

Writing servers is easy, writing clients is hard.

You'll be happy you did.

  • I'd say "Writing something that doesn't have to interact with previous implementations of the other side is easy, writing something that does is hard". I'm sure writing IIS, OWA etc to be bug-compatible with old browsers (Netscape 4.x, that tiny non-upgradable browser in 5-year-old PalmPilots, etc) is just as tricky.
  • Like the previous commenter, I don't see where the client/server split is in this. In general, both clients and servers have to operate with a variety of interpretations of the protocols they nominally speak.

    Clients are only harder to write than servers if servers are generally less accurate in their interpretation of specifications than clients are. Which would, of itself, imply that servers must be harder to get right than clients, which is why they're more likely to get it wrong.

    QED, not.
  • Jonathan and Will,
    You're missing half the point - regardless of leaky abstraction issues, clients have to make decisions about which protocol elements to use against which servers. Servers don't. All the server has to do is to figure out how to implement the protocol elements on the OS on the server. Now this may not be a trivial problem, but it's containable.

    On the other hand, clients need to deal with differences in server implementations AND in protocol differences. A really simple example: A client needs to choose between using HTTP 1.0 and HTTP 1.1 when it interacts with an HTTP server. The semantics of each protocol are subtly different, and the client has to support both. IMAP clients need to know if they're going to support IMAP2bis servers or just IMAP4rev1 servers. And they need to have code to handle both of these cases. CIFS clients need to determine which of the seven different CIFS variants they're going to support, and how to implement platform-specific features against servers that don't necessarily support those platforms.

    You're right that servers have to deal with client variation - the Exchange IMAP server has code in it to deal with a buggy IMAP client from a 3rd party vendor. The client was in clear violation of the protocol, but we changed the server in the name of compatibility.

    But the SERVER was orders of magnitude easier to write than the client. Even though the server's semantics weren't quite those of IMAP, we were able to do that work in relatively little time. But the IMAP client (especially a quality IMAP client) took far longer to write (and get correct).
  • I'm afraid I'm still missing it. I don't doubt that you can find specific examples where the rules of a protocol make it easier at one end than the other, but I still don't see that this is a general fact of life.

    I don't even follow the HTTP example - surely a decent webserver needs to support HTTP1.1 for performance AND 1.0/0.9 for older clients. Meanwhile, a client could merely support 1.0 if it chose to be simpler? Of course, in real life they probably both need to support both to be considered 'good', but I still don't think we're proving an asymmetry here.
  • Larry,

    this, btw, also is true for windows clients, not only network clients... :-) [and yes, i _did_ write my share of networking clients and servers.]

    WM_CHEERS
    thomas woelfer
  • Will,
    One example I gave was in the article - an IMAP server needs to figure out how to map the semantics of the protocol onto their engine, but when that's done, it's done. A client, on the other hand has a whole menu of protocol options to choose from. Some perfectly legal choices result in a crappy client. Others result in a high quality client. The server has no such choices.

    Another example has to do with CIFS. There are at least four different CIFS verbs for reading data from a file: Read, Read&X, Raw Read, and Blocked Read (this is a rough approximation). Each of these has a different performance characteristic, and each of these is supported by different servers - Win2K servers support all 4 of them, NT 3.1 servers support the first three, Lanman servers support the first two, and MS-NET servers only support the first. So when you're talking to an NT server, which of the verbs is the best one to use? Well, it depends on a lot of different things - how much resources you have on the client, what you believe is going to be the next operation performed by the client, how large the data transfer is, what's the state of the network connection between the server and client, etc.

    Making that choice is HARD. On the other hand, for the server the choice is really easy (trivial, in fact). It just has to take the read request and turn it into a read request to the filesystem (raw read and blocked reads are slightly more complicated because they need buffers larger than the default server buffer, but they're still trivial).
  • I was explaining to someone the other day why I have such a great respect for shell devs. I wish I'd had this client/server metaphor in my repertoire. It makes a more compelling argument than my "they have to understand everyone else's stuff and still know all about UI".
  • Nice piece of advice :)

    I may be naive, but I have one question that begs to be asked:

    In my ideal world (no, I haven't implemented clients and servers for a living... yet :)), protocol implementations are simply deterministic finite automata. Just like parsers are, for instance. The natural question is: For describing programming languages we have de facto standards like the EBNF notation; why isn't there a standard for describing network protocols? This would simply throw uncertainty out of the window...
  • Yes. Finite state machines and Petri nets are often used to model protocols.

    An FSM can produce a highly detailed and unambiguous design for a protocol, if properly written and interpreted. Tanenbaum covers the topic quite nicely (Computer Networks, ISBN 0-13-394248-1).

    I've also read a little bit about languages that can be used to describe protocols, but I'm not familiar with any.
  • Addition:

    Formal specifications are excellent at describing syntax and deterministic operations, but terrible at describing semantics. BNF, for example, does not describe any semantics. Just take a look at the Algol68 specification - formally defined, and impossible to implement.

    It's pretty difficult to formally define the semantics of anything nontrivial.
  • I suppose, a server says "This is how things are, deal with it!" (Which is easy.)
    And the client has to deal with it, and all the other types of "it" that come from different servers. (Which is hard.)
  • I think that depends on the server - doesn't the reverse case also apply? eg, clients say "This is how I want xyz, deal with it"?

    I can't think of a good example of that off the top of my head, though :)
  • Charlie, that's a good way of putting it.