In my last entry, I tried to illustrate (hopefully with some degree of success) the reasoning behind viewing software as an organism, and what we might learn from such a comparison. In this entry, I hope to clarify the analogy a bit more, to give us a launching point for using it more productively.

Specifically, I hope to clarify the boundaries of the organism, as well as the boundaries of taxonomy. What would we consider a single instance of a software organism? What defines a species? The answers influence which conclusions we are able to draw, so I believe this exercise truly is important.

Consider, for example, Microsoft Word (which I am using to author this post). The underlying DNA behind this application is the binary code for the 2003 version, Service Pack 1, with all of the latest patches applied. Does this particular instance of Microsoft Word represent an organism, or do all instances of a particular version (at the micro-level, meaning that the next time a patch comes out I will have a new version), put together, represent a single organism?

The best logical argument I can come up with classifies the entire collection of instances of a particular version as a single organism. Using our analogy, consider the human body. It consists of a large number of cells, each containing the same DNA. Each cell exhibits a phenotype that depends on its location and on the chemical environment within and around it.

In a similar way, one instance of Microsoft Word may be trying to operate in an environment where it cannot survive (for example, an operating system other than the one the developers targeted). It may be operating in an environment where it does not perform well (such as an instance running on a very busy e-commerce server). It may be running on a computer where a virus has changed its binary code, literally modifying its DNA so it exhibits a different phenotype. The poor health of individual instances will not necessarily harm the organism itself, until such time as inflexibility to variations in the electronic environment causes people to stop acquiring and using the product, eliminating it in a process of natural selection in favor of a superior alternative.

It is convenient that this classification also happens to be very useful. By considering all instances of a version, taken together, as one organism, we can then consider other organisms of the same species (in this case, a rival word processor) as well as ancestry (previous versions of the same application).

It also offers us the opportunity to measure success, perhaps using sales figures, download rates, and lifespan. We gain the concept of selection. When a developer releases a version of software, that software organism grows to a particular size. The nature and rate of growth of that organism determine whether the developer creates another version (organism). It may also give rise to competing organisms, which seek the same resources (money) that the existing organism is consuming.
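For readers who think better in code, here is a rough sketch of the classification described so far: each installed copy is a cell, all copies of one exact version together form the organism, and organisms sharing a species label can be compared. Every name in it (`Instance`, `Version`, `healthy_fraction`, `same_species`, the example hosts and build labels) is hypothetical, made up purely for illustration, and it is only one of many ways to model the idea.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Instance:
    """One installed copy of a version: a 'cell' in the analogy (hypothetical model)."""
    host: str
    healthy: bool = True  # False for a hostile environment or a virus-altered binary


@dataclass
class Version:
    """All instances of one exact build, taken together: the 'organism'."""
    species: str  # e.g. "word processor"
    name: str     # illustrative label for the build
    instances: List[Instance] = field(default_factory=list)

    def size(self) -> int:
        # The organism's size is simply the number of installed copies.
        return len(self.instances)

    def healthy_fraction(self) -> float:
        # Share of copies running well; an organism with no copies scores 0.0.
        if not self.instances:
            return 0.0
        return sum(1 for i in self.instances if i.healthy) / len(self.instances)


def same_species(a: Version, b: Version) -> bool:
    # Two organisms compete when they belong to the same species.
    return a.species == b.species


# Illustrative usage with invented data.
word = Version(
    "word processor",
    "build A",
    [Instance("laptop"), Instance("busy e-commerce server", healthy=False)],
)
rival = Version("word processor", "rival build")
```

In this toy model, size and the healthy fraction stand in for the success measures mentioned above; real measures such as sales figures or lifespan would need data the sketch does not have.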

This provides a strong analogy to evolution. Genetic code. Selection. Mutation. Embryology (the environment in which the organism grows). Assuming that we agree on this as a starting point, it’s probably about time to start leveraging this analogy productively rather than continue to strengthen the case for using it.

I have had a couple of comments regarding where I am going with this. All have suggested genetic algorithms, which are interesting, but this was not where I was originally heading. (Of course, now I feel almost obligated to head there at some point.) To Ralf, I say yes – I do intend to explore how we could leverage this to build the next ERP system. :-)

In addition, somebody contacted me because he was unable to leave a comment on the blog itself. For now, I have enabled anonymous comments, and we will see how that goes.