Why does it take so long to ship Hello World?

A recent comment on a Slashdot story about the Longhorn / WinFS announcement asks why WinFS is taking so long to develop, and predicts that once it is released there will be an open source clone within months. (Something similar happened with the CLR and the Mono project; it took thousands of person-years for Microsoft to develop the .NET Framework, but much less time for the Ximian folks to build a compatible implementation).

That's just the way the world works.

Let's take for example everyone's favourite program, Hello world. Once K&R published "Hello world," everyone knew what it was supposed to do and they could trivially write clones of it. People started writing "Hello world" in different languages (both human and computer). They started adding bells and whistles, like printf("Hello %s\n", argv[1]) so that you could get the computer to say "Hello" to whatever name was given on the command line (or complete garbage if no such argument was given :-) ). People did all sorts of things, and it only took a few short seconds to whip up a "Hello world" clone and share it with your friends.
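The "complete garbage" case has a concrete cause: when no argument is given, argv[1] is a null pointer, and handing a null pointer to %s is undefined behaviour. Here is a minimal sketch of a variant that guards against it. The helper name format_greeting and the fallback to "world" are assumptions made for illustration, not anything K&R published.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "Hello <name>\n" into buf.
   Falls back to "world" when name is NULL or empty, which is the
   situation you are in when no command-line argument was supplied.
   Returns whatever snprintf returns (the length it wanted to write). */
int format_greeting(char *buf, size_t size, const char *name)
{
    assert(buf != NULL && size > 0);

    if (name == NULL || strlen(name) == 0) {
        name = "world";
    }
    return snprintf(buf, size, "Hello %s\n", name);
}
```

A main function would call it as format_greeting(line, sizeof line, argc > 1 ? argv[1] : NULL) and then print line with fputs(line, stdout), so argv[1] is only ever treated as a name when it actually exists.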

But how long did it take K&R to write the original? Obviously I don't know, but based on how things happen at Microsoft, I can imagine...

Please note that this post is not an "open source only copies what Microsoft does" post. It is simply a "designing and building a new product takes a long time" post.

First of all, you have to think about the "customer" and the "scenarios." Sure, you're writing a book about the C programming language, but who is the real customer? Is it solely for college kids studying CS-101? What is "hip" and "cool" for college kids these days? How could the program -- their very first introduction to the language -- best connect with the Youth of America?

Or is it for professional programmers, already well versed in COBOL or FORTRAN or some other language? What kinds of day-to-day tasks would they be familiar with that your program could emulate so as to show how they could translate their existing skill sets to C? (Or were you trying to show how much faster / better / more powerful / easier to use / etc. C was than That Other Language?) You could spend weeks writing long e-mail threads, having heated debates at meetings, hiring consultants to do user studies and market sizing activities and so on just trying to answer this simple question.

The average room full of monkeys could have banged out at least a couple dozen variations on "Hello world" in the time it took you to decide this.

So now you've figured out who the customer is and what scenarios you're trying to enable (print a simple message on the screen)... what next? OK, what should the exact text of the message be? How universally recognisable is it? Will it offend anyone of a different culture? How easy is it to localise it for other languages? Will it generate Product Support calls because people don't understand the message? How easy is it to test that the correct text was displayed once the program has run?

After some more meetings and e-mail threads, you settle on "Hello, world" as the text because you figure "Hello" is a pretty universally recognised greeting, and "world" is pretty inclusive of all peoples on the planet (hopefully no aliens will be running your program!). But now someone in the group brings up the issue of extensibility. Presumably you're writing this program so that other people can build on it, right? So shouldn't it include at least some form of customisability, or provide obvious entry points for further expansion (such as the snippet listed above)? What good is "Hello world" if all it can do is print out "Hello world" and there's no way for the customer to modify it to meet their critical business needs?

So now you re-visit your target customer / scenario decision for another week, just to make sure that ease-of-extension-to-solve-critical-business-needs is not one of your goals for this program. Now a month has passed, but at least everyone is on board with The Vision for the program -- it's just a program to print out "Hello, world" and is not the foundation for Microsoft Excel or SQL Server.

Meanwhile, the room full of monkeys has shipped "War and Peace."

In four different languages.

OK, you're going to display some text to the user, and the text is "Hello, world" -- but how do you get the text on the screen? (We'll pretend that GUIs and web browsers and so on haven't been invented yet, so there's no debate about which windowing API or which GUI framework to use, etc). Do you use printf or puts (or fputs or fputc in a loop or...)? Obviously printf is more powerful and lets the customer experiment more with the program, but that's not the goal of your program (see previous paragraph). The puts function is less powerful, but it will automatically put a newline at the end of the string and not confuse newcomers with the strange \n syntax. (Oooooh, a new issue to track! Do we need a newline at the end of the string or not? Let's set up a meeting!)
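To make the newline question concrete, here is a small sketch of the two candidates side by side. The function names say_with_puts and say_with_printf are made up for this illustration; only puts and printf themselves are standard.

```c
#include <stdio.h>

/* puts appends the newline itself, so the string literal needs no escape.
   It returns a non-negative value on success and EOF on failure. */
int say_with_puts(void)
{
    return puts("Hello, world");
}

/* printf prints exactly what the format string says, so the \n has to be
   spelled out. It returns the number of characters written. */
int say_with_printf(void)
{
    return printf("Hello, world\n");
}
```

Both print the same line. The difference is only in who supplies the newline, and in printf leaving the door open for format specifiers later.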

You decide after some time that although the goal of the first program isn't to be immediately extensible by end-users, you do want to build on it in the book to introduce new concepts and so for that reason printf is the way to go. Having the \n in there is a bit confusing, but it lets you talk about character escapes which the users need to figure out pretty darn soon anyway. Some people on the team have reservations about using printf (it will cause customer confusion and hence generate calls to PSS), but time is marching on and you have to ship something soon (the publisher keeps calling wanting to know how the book is coming along).

OK, you finally have the program code written, but for historical reasons the source file is named kosciusko.cpp (the code name of the project was "Mt. Kosciusko") and the legal department doesn't want you to ship it that way. You hastily get together with everyone on the team and decide to re-name the file to hello.cpp, re-run all your tests, update all the documentation, etc. and a week later you're good to go.

By this time, the monkeys have formed their own advanced civilisation and invented JScript .NET, thereby rendering your new "C" language completely irrelevant.

And we didn't even go into the details of testing, localisation, globalisation, documentation, support, usability, accessibility, security, servicing, versioning, marketing, evangelism, training, and so on. The point is that it takes a very long time to design and build a brand new product that will be used by tens (or hundreds) of millions of people, many of whom have little or no knowledge of how computers work. Once the design is done and the first version has shipped, banging out the code to make a clone is relatively easy. It's also possible to build a better / faster / more feature-rich version, too, because you have a "known quantity" to work with. You can take "Hello, world" and add command-line arguments to it pretty easily, because all the hard work (like figuring out that printf was the right function to use) has already been done!

A great book that goes into this process at Microsoft in more detail is I Sing the Body Electronic, although it is now over ten years old. Showstopper! is another classic book about the making of Windows NT, although it too is quite old. A more recent book that shows some similar problems defining and building games at id Software is the entertaining Masters of Doom.

  • Please note that this post is not a "Microsoft only copies what Apple, Mozilla have done long ago" post. It is simply a reminder that "Apple will ship Tiger with same or more features sooner than Longhorn, Mozilla has already shipped XUL, and Sun did cross platform Java ages ago" post.

    So how exactly do WinFS, Avalon, XAML and .NET/CLR classify as innovation? Weren't they copies of something or other which was or will be done in far less time?
    "banging out the code to make a clone is relatively easy" - Even then Longhorn takes so much time, pity.
    Perhaps putting in all those security holes is what takes so much time.

    Guess who's first generation monkey then?
  • Very nice explanation!
  • Imagine a blog entry where I explain what I've been up to as of late while shipping the first version of the System Definition Model.
  • "The average room full of monkeys could have banged out at least a couple dozen variations on 'Hello world' in the time it took you to decide this."

    And they do so, and let the market decide which is best. The market can easily do this pretty cheaply, since all the variations are free.

    Microsoft can't do this. It has to make one version and it has to be *Right*.

    This is why Microsoft is ultimately doomed. The development model is just too slow to keep up.
  • "..it took thousands of person-years for Microsoft to develop the .NET Framework..". But why?

    I realise this has been said a hundred times already, but didn't Mono do to .NET what .NET did to Java? Where's the innovation in WinFS? Is it not very similar to BeFS (created in 1996)?


  • Dennis, thanks for the comment.

    You are missing one important thing though: There can't be a dozen different implementations of the product yet because *nobody knows what the product is*. (The monkeys can ship a dozen implementations, but since they're just a dozen programs out of the infinite number of other random files they produced, you don't know which ones are the "right" ones).

    Kind of like the "Total Library" problem.

    And I'd rather have one implementation that was "right" than a dozen that were "partially right" (but in a dozen different ways).
  • Kevin: I don't know enough about WinFS or BeFS to comment exactly, but I'm pretty sure that BeOS didn't have to worry about being useable by hundreds of millions of users, or being compatible with tens of thousands of existing applications, or working with (?) thousands of terabytes of existing user data.

    Building a v1 product is hard enough; building a v1 product that needs to be compatible with ten or twenty years of legacy is even harder.
  • Sorry for the delays in comments getting posted; it seems this new moderation system isn't working too well :-(
  • .NET vs Java:

    It took Sun a long time to build the original Java runtime, and then to add stuff like J2EE and JSP on top of it.

    How long did it take Microsoft to build the original J++ product? Not long at all, because it was an implementation of Java without anything really new added. And for a while it was routinely quoted as being the fastest / most standards compliant, too.

    The CLR took a long time to build precisely because it is *not* Java (even though it does have some similar goals).

    XUL vs Avalon:

    I don't know enough about this to comment, but maybe you can figure it out for yourself. Answer the question "Why do we need XUL when we already have DHTML?" and then imagine there is a similar set of reasons for why we might want Avalon over and above what XUL does.
  • Peter:
    You avoided answering why Tiger is taking less time to ship even when it will ship all the stuff Longhorn will copy - See CoreImage, CoreData, Spotlight.

    And XUL is much more than just DHTML. Please visit http://www.xulplanet.com to see. If DHTML was equal to XUL, Microsoft could have built IE using DHTML. (Till now I haven't heard of anyone building a browser, mail client, etc. using DHTML; Mozilla has done it using XUL and done it successfully.) It's far more elegant and cross-platform than any one can even dream of!
  • Jokers: It's far more elegant and cross-platform than any one can even dream of!

    <sarcasm>
    I guess we'd better tell all the universities to shut down their research labs then, since everything has already been thought of by the creators of XUL.
    </sarcasm>

    Only the "chrome" of the browser / mail app is built with XUL, not the browser itself. The chrome could be built with DHTML as well, it's just that no-one has bothered to do so. (Well, you could say that Hotmail is a mail client interface built with DHTML...).

    re Tiger: I don't have to answer every question ;-)
  • "..building a v1 product that needs to be compatible with ten or twenty years of legacy is even harder.."

    A valid argument, but reading between the lines, the overall impression I take from this post is that 'Microsoft keep missing deadlines because we're always creating something new'. I'm not sure if this is the point you're trying to make, but it sounds like what you're really trying to say is 'Microsoft keep missing deadlines because our customers have been using Windows for blooming ages and we have to deal with all that history'.

    If that's the case then how does the original .NET / Mono argument relate? Apart from 'Interop / C++' issues (which I can see might take a little bit of thought, but not thousands of hours) what exactly did you have to make compatible?

  • Jokers? Isn't Spotlight just a search engine with an index? I believe even w2k has search indexing (which you have to turn on).

    I also believe that DirectX predates CoreImage by more than half a decade.

    And what exactly is CoreData ?

    As for Tiger, it will take less time to ship because basically it's much simpler.

    BTW: Indigo is looking amazing!

  • It's probably a bit of both. I can't speak for everyone at Microsoft (in fact I speak for no-one but myself), but *many* times a feature is horribly delayed / scrapped entirely because we can't figure out how to make it "discoverable" or "usable" by someone who's been using the previous version of the product for X years.

    Like trying to remove the steering wheel + pedals from a car and replacing them with an XBox controller :-)

    The .NET -> Mono aside got mixed up with Java -> .NET comparisons. The *original* comparison was WinFS -> hypothetical clone. I hope it makes more sense in that light; once an implementation exists, cloning the feature set takes less time than building an entirely new (even if similar) product.

    I also didn't intend for it to be related to compatibility concerns (although obviously C++ / COM / VB / JScript / etc. compatibility was important). The compatibility play was for the BeFS -> WinFS comparison.

    There are now officially too many different comparisons / analogies going on in this post! :-)
  • [I edited Joker's post to remove copy & paste text from another web site; please don't do that since it may violate Copyright laws]

    Nobody: C'mon. I mean even on XP, with a fully indexed disk, it takes minutes to search just for a file. And you don't have the kind of functionality of Spotlight either. So no, Spotlight is not 'indexed disk'. Go figure what it is. (There are demos on apple.com/).
    I couldn't understand how *exactly* DirectX crud predates CoreImage. Mind to explain?

    Here is what CoreData is from Apple Insider -
    [ed: removed direct quote]
    http://appleinsider.com/article.php?id=593

    Peter:
    <Sarcasm for Sarcasm> I thought all of the most complex (more the time it takes, more the resources it requires, lesser the use, more complex it is;) innovations and research happened only at Microsoft. Ok. So there are universities doing research as well;)
    </Sarcasm for Sarcasm>
    Whatever you said about DHTML vs. XUL - No one is going to buy that. BTW The hotmail analogy didn't really mean anything, did it? (So is this blog page a client interface, and then there is air, water etc. but, I mean what?)

    And I don't think you can compare this Longhorn marketing crap with University research, even if it is very complex judging by the fact it took so long to build. They invented UNIX, they invented a Mouse, they invented the Web. That _was_ research. What Microsoft is doing with Longhorn - It's no research, by any means. It's just marketing FAD.

    Yes you don't have to answer the Apple question - The answer is well known!


