<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Chris Jackson's Semantic Consonance : Software Evolution</title><link>http://blogs.msdn.com/cjacks/archive/tags/Software+Evolution/default.aspx</link><description>Tags: Software Evolution</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>Designing the User Experience</title><link>http://blogs.msdn.com/cjacks/archive/2006/01/23/516412.aspx</link><pubDate>Tue, 24 Jan 2006 00:34:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:516412</guid><dc:creator>Chris Jackson</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/cjacks/comments/516412.aspx</comments><wfw:commentRss>http://blogs.msdn.com/cjacks/commentrss.aspx?PostID=516412</wfw:commentRss><wfw:comment>http://blogs.msdn.com/cjacks/rsscomments.aspx?PostID=516412</wfw:comment><description>&lt;P&gt;So far, I have covered two important aspects of software design that will impact how this software will evolve into the future. A brief recap:&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Evolvability. &lt;/STRONG&gt;When developing software, you are fighting a battle against the combinatorial mathematics that govern the complexity of any respectably sized software project. There are simply a huge number of possible approaches. Most of these approaches do not work. Of those that do, some will be better in terms of security, performance, maintainability, and so on. Genetic algorithms are a special case of this: they are a technique for rapidly covering a problem domain without investing the man hours to actually code each separate solution, and they may take approaches that seem completely non-obvious to the developer, but are in fact preferable to the straightforward approach that a human would apply to that problem. However, even if using a genetic algorithm to crawl your problem domain is not practical, there are lessons to be learned from the implementation of evolution in biology. For example, it is a bad thing when modification of a gene at locus A confers some evolutionary advantage, but simultaneously presents a phenotype that is negative. The same is true for software. At a granular level, object-oriented programming helps to solve this problem. At a higher level, there are architectural approaches that make this easier. What is important is the underlying ideal.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;Parallelism. &lt;/STRONG&gt;There is no question in my mind that massively parallel algorithms are going to be key to success in the future, even more so than they are today. Being able to design systems that are massively parallel will not only allow your software to take advantage of hardware designs that increase computation capabilities through parallelism, but they present a whole new set of opportunities to interact with external processes and services in a way that is non-blocking and more intuitive. Why should I wait for my email client to check email before I can read email that I have already downloaded?&lt;/P&gt;
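&lt;P&gt;As a minimal sketch of that email scenario in Python (the function names, the simulated delay, and the message contents are all assumptions made up for illustration, not part of any real mail client), a background worker can check for new mail while mail that is already downloaded stays readable:&lt;/P&gt;

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical local cache of mail that has already been downloaded.
DOWNLOADED = ["Welcome aboard", "Build results"]


def check_for_new_mail():
    """Stand-in for a slow network call; the delay is simulated."""
    time.sleep(0.05)
    return ["Meeting moved to 3pm"]


def read_downloaded(index):
    """Reading cached mail needs no network access at all."""
    return DOWNLOADED[index]


def demo():
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(check_for_new_mail)  # starts in the background
        already_read = read_downloaded(0)          # does not wait for the check
        new_mail = pending.result()                # block only when the result is needed
    return already_read, new_mail


if __name__ == "__main__":
    print(demo())
```

&lt;P&gt;The point of the sketch is only the ordering: the read happens while the check is still in flight, and the caller blocks at the single place where it actually needs the new mail.&lt;/P&gt;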
&lt;P&gt;On top of this, I would add a third category, which does not have as direct a biological analogy as the others, but is nonetheless extremely critical.&lt;/P&gt;
&lt;P&gt;&lt;STRONG&gt;User Experience.&lt;/STRONG&gt;&lt;/P&gt;
&lt;P&gt;By user experience, I am thinking of two perspectives. The first is the overall appearance, which is how we frequently think of UI Design. Given a particular screenshot, implement this in code. Why is this experience important? Here, we are trying to convey a sense of quality, and drive a strong first impression. Depending on the software you are creating, this may be more or less critical. Some software gets used because there is no real competition, and some makes up for a weak first impression with solid interaction design. But I believe appearance also drives how you feel about using the application. Does it feel modern? This may make me more excited about using it. Does it feel of high quality? This will make me trust the software more.&lt;/P&gt;
&lt;P align=center&gt;&lt;IMG height=152 src="http://www.microsoft.com/library/media/1033/windowsvista/images/experiences/Web_MediaPlayer_album02.jpg" width=207&gt;&lt;/P&gt;
&lt;P align=center&gt;&lt;CITE&gt;Windows Media Player 11 UI Design&lt;/CITE&gt;&lt;/P&gt;
&lt;P&gt;This experience has real value, which is why screenshots of new products are actually interesting to post.&lt;/P&gt;
&lt;P&gt;The next perspective is that of interaction design, which is different from what I am calling UI Design. Are you able to get your work done in a more efficient manner? Are your options exposed to you in a way that you can discover them? Is the act of navigating the application effortless and logical? For example, consider that we are designing a VCR. Rather than sending one group away to develop a list of features (such as multiple timed recordings, playback, and manual recording) and another to design the appearance of the device itself, can we approach it differently and design it first from the perspective of interaction? Yes. We can think in terms of scenarios. One important scenario is setting the clock so it is not flashing 12:00 indefinitely! Another is the scenario where somebody wants to record what they are currently watching. The final scenario is one where we want to set up several programs to record on an ongoing basis. Given these scenarios, we can start to explore them more thoroughly. We would like to understand why so many VCRs blink 12:00 all of the time. Can we solve the first scenario more easily? Is there a time source accessible to our VCR that doesn't involve the human at all? If it must involve a human, can we design it as part of the setup experience, understanding that nobody is interested in having a flashing 12:00 sign? But if we make it part of the setup experience, now we are preventing people from using their new VCR to play a video tape right away, until they answer a bunch of our questions - should we postpone that experience until the user requests it? If we do, how do we expose a way to do it at the user's convenience? Do we expose a hardware button? Do we post a notification that can go away? We haven't even exhausted our set of questions about the first scenario, and already we have to make some difficult decisions. However, this is time well spent, as these decisions affect everyone who uses the VCR.&lt;/P&gt;
&lt;P&gt;As I continue to evolve my blog, these are the three areas that I have run into time and time again as being important skills for the software practitioner (including myself!) both today and going into the future. To solve the hard problems that await us, we need to be able to search the problem domain more efficiently. We need to support parallelism, which will be supported more and more completely in hardware as time passes. And we need to provide a positive user experience that is focused on fulfilling users' goals.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=516412" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/cjacks/archive/tags/Software+Evolution/default.aspx">Software Evolution</category></item><item><title>Stealing Ideas: External Evolutionary Events and API Mimicry</title><link>http://blogs.msdn.com/cjacks/archive/2005/12/01/498938.aspx</link><pubDate>Thu, 01 Dec 2005 19:10:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:498938</guid><dc:creator>Chris Jackson</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/cjacks/comments/498938.aspx</comments><wfw:commentRss>http://blogs.msdn.com/cjacks/commentrss.aspx?PostID=498938</wfw:commentRss><wfw:comment>http://blogs.msdn.com/cjacks/rsscomments.aspx?PostID=498938</wfw:comment><description>&lt;P&gt;Last month, we released the &lt;A href="http://msdn.microsoft.com/netframework/downloads/updates/default.aspx"&gt;.NET Framework 2.0&lt;/A&gt;. However, a significant number of organizations are not going to immediately migrate to the new platform and tools. While it's not that difficult to see the benefits of using the new platform, most organizations are very careful and deliberate about changing their platform. That leaves these organizations in a situation where they can clearly see what is coming, yet they do not yet have access to these capabilities. 
There is an external evolutionary event that impacts their development strategies, and the wise thing to do is to incorporate this visibility into their current approach to developing software.&lt;/P&gt;
&lt;P&gt;So, what do you do when you live in a .NET 1.1 world, but you still have to plan for the coming .NET 2.0 world?&lt;/P&gt;
&lt;P&gt;I am working with a very large customer who is facing this exact scenario. This customer is completely overhauling their identity management strategy. In particular, they have decided to implement an external identity management strategy. They have to strategize around things like federation and ADFS, without actually having a production implementation of this technology. Part of this strategy is to standardize where they are going to store the identities of external individuals and organizations (for, without federation, they still hold responsibility for managing the identity stores and claims of users accessing their systems).&lt;/P&gt;
&lt;P&gt;In the past, the approach to managing these identities depended on the team. Some teams were storing information in a SQL Server database. Others were utilizing an external Active Directory forest. Each individual project team had to determine the strategy for their project, and then they were responsible for the implementation. This obviously has consequences. There are manageability issues: each application has its own provisioning procedure and tools. There are security issues: writing authentication code and defining an authentication store is hard. There are user experience issues: each user could have a completely different set of credentials for each application. There are productivity issues: each team has to incorporate time into their schedule to develop the authentication store and code. The list goes on and on. Rather than perpetuate this project-driven approach, it made sense to tackle the problem holistically. This customer decided to standardize on using an LDAP directory store for their external identities: &lt;A href="http://www.microsoft.com/downloads/details.aspx?FamilyId=9688F8B9-1034-4EF6-A3E5-2A2A57B5C8E4&amp;amp;displaylang=en"&gt;ADAM (Active Directory in Application Mode)&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;Given a new infrastructure standard, the attention then goes to the application developers. Does every developer now need to be trained in using the System.DirectoryServices namespace? While there was some expertise in this before from the applications using the external Active Directory forest, this knowledge was neither complete nor widespread. Also, while we have reduced the project schedules already by the amount of time necessary to decide on and create an authentication store, wouldn't it be great if they could reduce the amount of time they invest in creating the authentication code? The "code reuse" light was flashing rather brightly here.&lt;/P&gt;
&lt;P&gt;The approach I took here was to gradually build levels of abstraction. The first step was to translate from System.DirectoryServices into actionable items. So, the first level of abstraction spoke only in terms of concepts such as Authenticate or ChangePassword. This frees developers from the semantics of calling into a NativeObject (because for whatever reason there is no Authenticate method in either ADSI or System.DirectoryServices) and from the semantics of invoking into ADSI. It made things easier. It immediately reduced the amount of code needed in every application by at least two-thirds, and occasionally far more.&lt;/P&gt;
&lt;P&gt;Yet, developers still would need to write quite a bit of code into each application - code that would doubtlessly be duplicated. So, I created a higher level of abstraction against the most common scenario: authentication. Every application needs to authenticate, so we can achieve some significant benefits from abstracting this away.&lt;/P&gt;
&lt;P&gt;ASP.NET provides an excellent validation framework, and it made sense to me to leverage this. Rather than calling an Authenticate method in code that you write, you could drop a validator object onto your ASP.NET page which will call this Authenticate method for you. Now, you just need to determine if Page.IsValid - if not, then you failed authentication, and the validator will display your error message for you. Another useful abstraction.&lt;/P&gt;
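&lt;P&gt;To make the first two layers concrete, here is a language-neutral sketch written in Python rather than against the real ASP.NET or System.DirectoryServices APIs. Every class and method name is hypothetical, and a plain dictionary stands in for the ADAM store, so treat it as an illustration of the layering under those assumptions, not as the code I delivered:&lt;/P&gt;

```python
class DirectoryFacade:
    """Layer 1: speaks only in terms of Authenticate and ChangePassword.

    A dict stands in for the directory store (an assumption for this sketch);
    a real version would hide the directory API calls behind these methods.
    """

    def __init__(self, users):
        self._users = dict(users)

    def authenticate(self, user_name, password):
        return user_name in self._users and self._users[user_name] == password

    def change_password(self, user_name, old_password, new_password):
        if not self.authenticate(user_name, old_password):
            return False
        self._users[user_name] = new_password
        return True


class AuthenticationValidator:
    """Layer 2: a validator that calls authenticate on the page's behalf."""

    def __init__(self, directory, error_message="Login failed."):
        self._directory = directory
        self.error_message = error_message
        self.is_valid = False

    def validate(self, user_name, password):
        self.is_valid = self._directory.authenticate(user_name, password)
        return self.is_valid


def handle_page(validator, user_name, password):
    """Page code shrinks to one check, mirroring the Page.IsValid idea."""
    if validator.validate(user_name, password):
        return "welcome"
    return validator.error_message
```

&lt;P&gt;Each layer knows only about the one beneath it, which is what lets the page code collapse to a single validity check.&lt;/P&gt;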
&lt;P&gt;Of course, validation still leaves some opportunities for improvement. Each developer still has to lay out their own page, include the validator, and then write the code to react to validation either failing or succeeding. Most login pages are going to look the same. Most of the time, the developers wanted the response to successful authentication to be the same - the standard FormsAuthentication redirection. Why should everyone have to write this? They don't have to. We can build a third layer of abstraction on top of the validation framework: a custom web server control that contains a place to enter a user name and password, a submit button, and an authentication validator. Set up a configuration section to put the location of the ADAM store in web.config, and now you can drop the control onto your login page, set a few configuration variables, and suddenly you are implementing authentication that looks consistent across the enterprise, with a consistent authentication store, using zero lines of code. Pretty neat, except...&lt;/P&gt;
&lt;P&gt;Oh no.&lt;/P&gt;
&lt;P&gt;It looks like I am about to build a &lt;A href="http://msdn2.microsoft.com/t863ehhh(en-US,VS.80).aspx"&gt;Login&lt;/A&gt; control - something I get for free in the .NET Framework 2.0! Of course, I also don't really want to wait for the organization to deploy the 2.0 Framework, because they can realize benefit from such a control immediately! What can I do to future-proof anything I make today?&lt;/P&gt;
&lt;P&gt;Steal.&lt;/P&gt;
&lt;P&gt;I intended to mimic all of the relevant APIs from the ASP.NET 2.0 Login control. Why? Because I knew that the organization would be moving to the 2.0 Framework at some point. It's OK for my control to become irrelevant, but I certainly don't want all of their applications to become irrelevant! If the API and behaviors are the same (and you really do need to spend a lot of time understanding the APIs to ensure that they are the same), the upgrade experience should simply be at the point of instantiation. Rather than instantiating a control of the class that I developed, they can instantiate a control of the class that the 2.0 Framework provides. Much less painful migration!&lt;/P&gt;
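&lt;P&gt;The shape of that idea can be sketched in a few lines of Python. This is an illustration only: both classes and all of their member names are hypothetical stand-ins, not the actual members of the ASP.NET 2.0 Login control, and the second class merely plays the role of the control the platform will eventually provide:&lt;/P&gt;

```python
class CustomLogin:
    """The stopgap control written for today's platform (hypothetical)."""

    def __init__(self, authenticate):
        self._authenticate = authenticate  # callable taking (user_name, password)
        self.user_name = ""
        self.password = ""
        self.destination_page_url = "/default"

    def log_in(self):
        if self._authenticate(self.user_name, self.password):
            return self.destination_page_url
        return None


class PlatformLogin:
    """Stand-in for the control the newer platform ships (hypothetical).

    Internals may differ freely; only the public surface has to match.
    """

    def __init__(self, authenticate):
        self._check = authenticate
        self.user_name = ""
        self.password = ""
        self.destination_page_url = "/default"

    def log_in(self):
        succeeded = self._check(self.user_name, self.password)
        return self.destination_page_url if succeeded else None


def create_login(authenticate, use_platform_control=False):
    """The single point of instantiation: the only code that changes on upgrade."""
    control_class = PlatformLogin if use_platform_control else CustomLogin
    return control_class(authenticate)


def handle_login_page(control, user_name, password):
    """Application code touches only the shared surface, never the class name."""
    control.user_name = user_name
    control.password = password
    return control.log_in() or "/login?failed=1"
```

&lt;P&gt;Because handle_login_page never names a concrete class, flipping the flag in create_login is the whole migration in this toy version. The real work, as noted above, is making sure the mimicked behaviors truly match.&lt;/P&gt;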
&lt;P&gt;So, in this scenario, I was able to generate immediate business value, while still taking into account the evolution of the surrounding platform infrastructure. &lt;A href="http://www.us.oup.com/us/catalog/general/subject/LifeSciences/EvolutionaryBiology/?view=usa&amp;amp;ci=0198528590"&gt;I simply had to rely on mimicry, a technique biological life has invented time and time again to increase the likelihood of survival.&lt;/A&gt;&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=498938" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/cjacks/archive/tags/Software+Evolution/default.aspx">Software Evolution</category></item><item><title>What is the sweet spot for genetic programming?</title><link>http://blogs.msdn.com/cjacks/archive/2005/11/10/491471.aspx</link><pubDate>Thu, 10 Nov 2005 22:45:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:491471</guid><dc:creator>Chris Jackson</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/cjacks/comments/491471.aspx</comments><wfw:commentRss>http://blogs.msdn.com/cjacks/commentrss.aspx?PostID=491471</wfw:commentRss><wfw:comment>http://blogs.msdn.com/cjacks/rsscomments.aspx?PostID=491471</wfw:comment><description>&lt;P&gt;&lt;a href="http://blogs.msdn.com/alexbarn/"&gt;Alex Barnett&lt;/A&gt; posted a comment in my last post referencing a video on the University of Washington website: &lt;A href="http://uwtv.org/programs/displayevent.asp?rid=902 "&gt;http://uwtv.org/programs/displayevent.asp?rid=902&lt;/A&gt;. This is a worthwhile video, and I wanted to comment on some of the contents. 
Of course, I freely admit my bias towards some of his opinions, because he seems to share my opinion that &lt;A href="http://www.simonyi.ox.ac.uk/dawkins/WorldOfDawkins-archive/index.shtml"&gt;Richard Dawkins&lt;/A&gt; is alpha to &lt;A href="http://www.stephenjaygould.org/"&gt;Stephen Jay Gould&lt;/A&gt; in the world of evolutionary scientists. Then again, perhaps it is my bias that caused me to infer this!&lt;/P&gt;
&lt;P&gt;In this video, Daniel Dennett shows a video showcasing some computer applications where genetic programming was used to evolve computerized "beings" that are modified and selected for using artificial selection - these selections were based on fitness as defined by swimming, walking, etc. He also posts a couple of slides from Dawkins' &lt;A href="http://www.amazon.com/gp/product/0393315703/102-5408990-0346537?v=glance&amp;amp;n=283155&amp;amp;n=507846&amp;amp;s=books&amp;amp;v=glance"&gt;The Blind Watchmaker&lt;/A&gt; - drawings of "biomorphs" produced by a genetic algorithm. The &lt;A href="http://www.amazon.com/gp/product/0393995461/102-5408990-0346537?v=glance&amp;amp;n=283155"&gt;PC version&lt;/A&gt; of this software is harder to find, but the &lt;A href="http://www.amazon.com/gp/product/0393993418/102-5408990-0346537?v=glance&amp;amp;n=283155"&gt;Mac version&lt;/A&gt; is still available on Amazon.com.&lt;/P&gt;
&lt;P&gt;Are examples such as these the reason why some people hear the term "genetic programming" and immediately think that it has nothing to offer them? When you look at the output of these programs, you do get a sense of wonder. Here is this interesting and complicated thing that nobody actually took the time to create. But is a set of boxes that can simulate walking useful to the average developer? &lt;A href="http://www.armadilloaerospace.com/n.x/johnc/Recent%20Updates"&gt;John Carmack&lt;/A&gt; can produce things that simulate walking around on your computer screen that are a lot more compelling to the average person than some boxes. Yes, it's a fascinating novelty because there was not an intelligent designer behind those boxes, but what is the practical use?&lt;/P&gt;
&lt;P&gt;Like most developers I know, I spend most of my time writing software that helps businesses run. I may want to write some value from the database on the screen. For my purposes, a simple Console.WriteLine is good enough. I may want to draw a chart on the screen. For my purposes, writing some GDI+ code to render that chart is good enough. As long as I have rendered the information quickly enough that my user isn't bored or dissatisfied, then gaining a few milliseconds may not be important. In fact, I am often more concerned with writing maintainable code to pass along to the next intelligent designer than I am with maximizing performance. Genetic programming has a tendency to produce somewhat obscure code. It may be provably correct, but more often than not, after you invest a huge amount of time attempting to understand it, you will shrug your shoulders and think, "I would never have thought of that. I still probably wouldn't, and I have already seen it."&lt;/P&gt;
&lt;P&gt;Does that mean, however, that there is no practical use, simply because this is not a pervasive programming style today? This is where I would disagree. There are many problems that are exceptionally difficult to solve. My most common experiences with difficult problems are around performance and scalability. If I need to write one line of text to a screen, then I can do that well enough, make it understandable, and save a lot of time doing it. If I need to write one line of text to 10 million screens, all in under one second, then suddenly I have a very different problem domain. Of course, I could develop, measure, tune, measure, tune, measure, and repeat until I get the application working well enough (which is, after all, what truly matters). But, depending on what I know, this could be an extremely long cycle. It also tends to be somewhat mundane. How do I know that I am tweaking the right thing? What if I never get there? Can this be automated? This automation is genetic programming.&lt;/P&gt;
&lt;P&gt;The vast majority of problems do not require such performance. Most developers can write an intranet application that works well enough. That doesn't mean that there is no benefit; it just means that the current expectations that people have of their software are being met. But if you could render more interesting information on your computer faster, or have more robust interaction with a low-bandwidth, occasionally connected device, then you enable entirely new scenarios. That is another place where genetic programming might be valuable.&lt;/P&gt;
&lt;P&gt;The problem, of course, is that genetic programming is hard to do. There are no robust tools to work with. The discipline is still very young. Dr. Dennett brings up an exceptionally good point, which is worth considering with regard to this problem. He points out the lack of noise in a computer simulation. The computer has nothing that you don't program into it. In one sense, this is very true. In another sense, we could consider that the computer program has been artificially filtering the noise, and that we then need to re-introduce some subset of it to be interesting. As I pointed out in my last post, there are 54,993,666,708,469,390,869,140,625 ways to construct a 10-machine-instruction program. That seems like a lot of noise to me! We have a huge problem with the combinatorial nature of our software, and we combat this by filtering noise and then attempting to deliberately re-introduce some of this noise.&lt;/P&gt;
&lt;P&gt;What we really want to do is find some solution to combat our combinatorial dilemma. The underlying problem we are trying to solve is how we should best go about arriving at the best possible solution out of the unfathomably huge number of alternatives. It is incredibly unlikely that even the most intelligent of intelligent designers is going to be able to pick the single best solution. There are more ways to write a program the size of &lt;A href="http://www.microsoft.com/windowsvista/default.mspx"&gt;Microsoft Windows&lt;/A&gt; than there are atoms in the entire universe. Obviously, we can't just try all of them. So, let's just wander around parts of the solution domain to see what we can solve. What is the best approach to taking a subset, yet still ensuring that you consider alternatives that may not seem intuitively obvious? Be random. Randomness combined with natural selection is simply a tool that we can use to wander around the set of all possible solutions, in the hope that we find something better than that which is simply most obvious to humans and our innate ability to see patterns (even when they don't actually exist).&lt;/P&gt;
&lt;P&gt;We can see the power of genetic algorithms in some of the work that has been done already. It's even more evident just by looking in a mirror. Randomly meandering about the problem domain can produce edge cases such as us that are pretty impressive. We simply need a good technical solution to introducing constrained randomness. Today, that is frequently done by limiting the number of instructions that a genetic program will use. We also introduce mutations at a controlled rate. (If your microscope is just barely out of focus, wouldn't you rather give it a small turn than a big one?) And we have observed from the biological world that sexual selection is an effective means of introducing necessary diversity when rates of reproduction are slow, and we attempt to somehow incorporate that into genetic algorithms. What is challenging is figuring out where to mutate. In the biological world, mutations are guided by chemical properties. Sexual selection is based on molecular interactions and the properties of the molecules that make the proteins. With computers, we just have a huge number of transistors that are either on or off. We have to decide how to break them up and mutate them. How do we decide what is a good way to cross two programs? I have not seen this answered satisfactorily.&lt;/P&gt;
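&lt;P&gt;As a rough sketch of what "constrained randomness" can look like, here are the two operators in Python over a toy program representation. The six-instruction set, the length cap of 10, the 10% mutation rate, and the one-point crossover are all arbitrary assumptions chosen for illustration; in particular, this crossover is just one simple choice and does not answer the open question of what makes a good way to cross two programs:&lt;/P&gt;

```python
import random

# Toy instruction set (an assumption; not a real machine's instructions).
INSTRUCTIONS = ["LOAD", "STORE", "ADD", "SUB", "JUMP", "NOP"]
MAX_LENGTH = 10       # cap on the number of instructions in a program
MUTATION_RATE = 0.1   # chance that any single instruction is replaced


def mutate(program, rng, rate=MUTATION_RATE):
    """Replace each instruction with a random one with probability `rate`.

    A low rate is the small turn of the microscope knob: most of the
    program survives each generation unchanged.
    """
    return [rng.choice(INSTRUCTIONS) if rate > rng.random() else op
            for op in program]


def crossover(parent_a, parent_b, rng, max_length=MAX_LENGTH):
    """One-point crossover: a head from one parent, a tail from the other.

    Assumes both parents are non-empty. The child may be shorter or longer
    than either parent, so it is truncated to the length cap.
    """
    cut_a = rng.randint(1, len(parent_a))
    cut_b = rng.randint(0, len(parent_b))
    child = parent_a[:cut_a] + parent_b[cut_b:]
    return child[:max_length]


if __name__ == "__main__":
    rng = random.Random(42)
    mother = ["LOAD", "ADD", "STORE", "NOP"]
    father = ["LOAD", "SUB", "SUB", "JUMP", "STORE"]
    print(mutate(crossover(mother, father, rng), rng))
```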
&lt;P&gt;So, I see the sweet spot of genetic programming to be very tangible. We have huge opportunities to improve performance and scalability, both in distributed architectures and on a smart client application that renders rich graphics. I see this sweet spot as having more meaning to the average developer once there are some tools in place, as well as some meaningful standards on how to implement mutation and crossing. I don't think it's irrelevant at all - I just think it's too hard. But as technology continues to get more and more sophisticated, we are going to need to find better tools to crawl the combinatorial problem space that this sophistication brings with it. Of course, much of the work may be done at the class library level, so the average developer doesn't need to understand the underlying sophistication - just benefit from the results.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=491471" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/cjacks/archive/tags/Software+Evolution/default.aspx">Software Evolution</category></item><item><title>Genetic Programming and Units of Selection</title><link>http://blogs.msdn.com/cjacks/archive/2005/10/12/480267.aspx</link><pubDate>Thu, 13 Oct 2005 00:11:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:480267</guid><dc:creator>Chris Jackson</dc:creator><slash:comments>7</slash:comments><comments>http://blogs.msdn.com/cjacks/comments/480267.aspx</comments><wfw:commentRss>http://blogs.msdn.com/cjacks/commentrss.aspx?PostID=480267</wfw:commentRss><wfw:comment>http://blogs.msdn.com/cjacks/rsscomments.aspx?PostID=480267</wfw:comment><description>&lt;P&gt;Genetic Programming is a fascinating field of study. Essentially, this is the study of software that writes software, selecting the software it has written that exhibits the highest degree of fitness, and allowing this software to continue to evolve over time. 
In essence, what Genetic Programming is trying to do is find some route to a higher degree of fitness that is more efficient than either random selection or intelligent design.&lt;/P&gt;
&lt;P&gt;To understand the magnitude of the problem, consider how quickly the size of the problem can grow. Say, for example, that you are using IA-32 assembly language on a processor much like my humble little Pentium M in my Tablet PC. How many discrete instructions does the IA-32 instruction set provide? Hundreds. Let's simplify things and say that there are 375 available instructions. We'll ignore the register architecture for the moment, which adds additional complexity. There are, consequently, 54,993,666,708,469,390,869,140,625 different ways to construct a program that is only 10 instructions long. Most of these don't work. Of those combinations that do work, most of them don't do anything useful. The very small number of combinations which do perform some useful activity don't do very much of it.&lt;/P&gt;
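&lt;P&gt;The figure is simply 375 raised to the 10th power, since each of the 10 positions can independently hold any of the 375 instructions. A quick check in Python (the 375 is the simplified instruction count assumed above; the function name is only for illustration):&lt;/P&gt;

```python
INSTRUCTION_COUNT = 375  # simplified instruction count assumed in the text


def count_programs(instruction_count, length):
    """Number of distinct instruction sequences of exactly `length` slots."""
    return instruction_count ** length


if __name__ == "__main__":
    # prints 54,993,666,708,469,390,869,140,625
    print(f"{count_programs(INSTRUCTION_COUNT, 10):,}")
    # Each extra instruction multiplies the search space by another 375.
    for length in (1, 2, 3):
        print(length, f"{count_programs(INSTRUCTION_COUNT, length):,}")
```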
&lt;P&gt;Most of us are all too familiar with just how easy it is to come up with a combination of instructions that just doesn't work.&lt;/P&gt;
&lt;P&gt;Now consider the fact that your program may not necessarily be 10 instructions long. You could potentially solve the exact same problem in 10 instructions, 5 instructions, or 38 instructions. In fact, any solution greater than 10 instructions could potentially be a successful solution that happens to use a lot of NOP instructions (unless, of course, the contents of the EIP register happen to be important to determining the correctness of the solution). All put together, there are a LOT of ways to create a program to solve a problem, and the overwhelming majority of them are wrong.&lt;/P&gt;
&lt;P&gt;Genetic Programming is an attempt to find a quicker way to the solution from this enormous number of potential solutions. It is not generally practical to try every possible solution to the problem to discover the best one. By starting off with some solutions, selecting the best, and then randomly mutating and recombining, Genetic Programming explores the problem domain much more quickly. Of course, because it covers the set of potential solutions randomly, it is possible that it could still miss the best solution, but any alternative that does not systematically test every possible combination of instructions is subject to the same limitation. The assumption, of course, is that solutions that are a smaller number of mutations away from the "best" solution will show a higher degree of fitness on the test used by your Genetic Programming algorithm.&lt;/P&gt;
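&lt;P&gt;The loop described above (start with some solutions, select the best, then mutate and recombine) can be sketched in Python on a deliberately trivial problem. Strictly speaking this is a genetic algorithm over fixed-length bit strings rather than a program that writes programs, and the fitness test, population size, and mutation rate are all assumptions picked for the example. The fitness test counts 1 bits, so it satisfies the assumption that candidates fewer mutations away from the best solution score higher:&lt;/P&gt;

```python
import random

GENOME_LENGTH = 20
POPULATION_SIZE = 30
MUTATION_RATE = 0.05


def fitness(genome):
    """Toy fitness test: the number of 1 bits (the best genome is all ones)."""
    return sum(genome)


def mutate(genome, rng):
    """Flip each bit independently with probability MUTATION_RATE."""
    return [1 - bit if MUTATION_RATE > rng.random() else bit for bit in genome]


def recombine(parent_a, parent_b, rng):
    """One-point crossover between two equal-length genomes."""
    cut = rng.randint(1, GENOME_LENGTH - 1)
    return parent_a[:cut] + parent_b[cut:]


def evolve(generations, seed=0):
    """Return the fittest genome found and the best fitness per generation."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(GENOME_LENGTH)]
                  for _ in range(POPULATION_SIZE)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        survivors = population[:POPULATION_SIZE // 2]  # select the best half
        children = [
            mutate(recombine(rng.choice(survivors), rng.choice(survivors), rng), rng)
            for _ in range(POPULATION_SIZE - len(survivors))
        ]
        population = survivors + children  # survivors carry over unchanged
    population.sort(key=fitness, reverse=True)
    history.append(fitness(population[0]))
    return population[0], history


if __name__ == "__main__":
    best, history = evolve(generations=40)
    print(fitness(best), history)
```

&lt;P&gt;Because the surviving half is carried over unchanged, the best fitness can never go down from one generation to the next. Nothing guarantees it reaches the maximum, though, which is the limitation noted above.&lt;/P&gt;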
&lt;P&gt;I don't want to delve too far into Genetic Programming today. What is important to note about Genetic Programming is the unit of selection: Genetic Programming modifies the source code of the most successful organisms when it is performing mutation or crossover operations.&lt;/P&gt;
&lt;P&gt;When I was doing my initial exploration of the terminology in &lt;a href="http://blogs.msdn.com/cjacks/archive/2005/06/20/430783.aspx"&gt;this post&lt;/A&gt;, I considered binary code to be analogous to DNA. This is still a perfectly suitable analogy in some regards. (That's the problem with analogies - they are never precise.) Binary code is still the device that drives the expression of phenotype. It is still sufficient to duplicate a piece of software. But, in one very important regard, I don't believe it is always the best analogy, unless you happen to be using binary code as the instruction set.&lt;/P&gt;
&lt;P&gt;An important concept is the unit of selection. If you have written an application in C#, and are maintaining that application, those constructs which you happen to save in that C# source code are those that survive to the next generation. This is, indeed, the unit of selection. The binary code is, of course, a hugely important aspect of the ontogeny of the organism, but I wanted to redefine the analogy I had used previously to be more suitable to some of the concepts that I believe are interesting to explore. As the unit of selection, the source code (in whichever language) is the most suitable analogy to DNA for my purposes.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=480267" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/cjacks/archive/tags/Software+Evolution/default.aspx">Software Evolution</category></item><item><title>Developing Less Complex Software: Gadgets and Coding for Fun</title><link>http://blogs.msdn.com/cjacks/archive/2005/10/11/479755.aspx</link><pubDate>Tue, 11 Oct 2005 22:41:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:479755</guid><dc:creator>Chris Jackson</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/cjacks/comments/479755.aspx</comments><wfw:commentRss>http://blogs.msdn.com/cjacks/commentrss.aspx?PostID=479755</wfw:commentRss><wfw:comment>http://blogs.msdn.com/cjacks/rsscomments.aspx?PostID=479755</wfw:comment><description>&lt;P&gt;In my last entry, I discussed complexity in evolution, and how the most highly complex software is, in fact, the edge case. Far more software is less complex; more people have written a "Hello World" program than have written an application of the complexity of, say, Microsoft BizTalk Server.&lt;/P&gt;
&lt;P&gt;This raises the question - how can I maximize my experience in building less complex applications? How do I do it at all? For anybody who loves to build fantastic software and change the world, it's important to leverage these opportunities to both improve and enjoy yourself.&lt;/P&gt;
&lt;P&gt;&lt;A href="http://www.sellsbrothers.com/spout/#My_Product_Group_Fun:_2"&gt;Chris Sells writes,&lt;/A&gt; "I found out something about myself: I'm really good at digging into the state of the art, whether it's one technology or a feature across technologies, if I have a problem I'm trying to solve. However, if I'm just wandering in a space w/o an explicit goal, e.g. give a presentation, build an app, write an article, I'm lost; I just can't muster any juice."&lt;/P&gt;
&lt;P&gt;I find this to be true as well. Say that I decide to come up with a piece of software to write which I can complete in a reasonable amount of time, gain some new experiences from, and share with as many people as I am interested in impacting. Frequently, I will spend more time coming up with the idea than I do actually implementing the project.&lt;/P&gt;
&lt;P&gt;There are a couple of useful tools that I have found for overcoming these obstacles. The first is the &lt;A href="http://msdn.microsoft.com/coding4fun/"&gt;Coding for Fun&lt;/A&gt; website. There are tons of starter ideas, as well as fully fleshed out ideas here. This is a great place to start, because you may find something you've been interested in learning here.&lt;/P&gt;
&lt;P&gt;More importantly, Microsoft is starting to define a platform for making your smaller, less complex applications more globally useful: gadgets. Not that this is a new approach. Stardock supports the notion of gadgets with its DesktopX software. Konfabulator has desktop gadgets. Desktop Sidebar has sidebar gadgets. The difference is that those gadgets run inside of an application, whereas gadgets are now arriving at the platform level. Check out &lt;A href="http://microsoftgadgets.com/"&gt;Microsoft Gadgets&lt;/A&gt; to read more and to find links to other information.&lt;/P&gt;
&lt;P&gt;I think this is important for a couple of reasons. First, it makes it easier to write something fairly cool in a short period of time. Why? You don't have to write all of the platform-level technologies such as containers and windows. You can just create the gadget to do something neat. Second, since there will always be more software that is less complex, we can expose that less complex software more robustly, make it more useful, and generally improve the overall ecosystem.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=479755" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/cjacks/archive/tags/Software+Evolution/default.aspx">Software Evolution</category></item><item><title>Evolution, Complexity, and Software Platforms</title><link>http://blogs.msdn.com/cjacks/archive/2005/09/19/471478.aspx</link><pubDate>Mon, 19 Sep 2005 23:48:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:471478</guid><dc:creator>Chris Jackson</dc:creator><slash:comments>5</slash:comments><comments>http://blogs.msdn.com/cjacks/comments/471478.aspx</comments><wfw:commentRss>http://blogs.msdn.com/cjacks/commentrss.aspx?PostID=471478</wfw:commentRss><wfw:comment>http://blogs.msdn.com/cjacks/rsscomments.aspx?PostID=471478</wfw:comment><description>&lt;P&gt;Webster's Dictionary defines evolution as, "a process of continuous change from a lower, simpler, or worse to a higher, more complex, or better state."&lt;/P&gt;
&lt;P&gt;I really hate pretty much every speech or writing which starts with the dictionary definition of something. Certainly it is a starting point, but it is probably the least creative of all starting points. It insinuates that somebody does not know the dictionary definition of a particular term. Generally, the author must arbitrarily select one definition among many, inherently including bias into the selection of the specific definition used. And did I mention that it's boring and overused? How ironic that I used it myself.&lt;/P&gt;
&lt;P&gt;Do most people agree with this particular definition? Certainly, it appears in the mass media. This is the way that I tend to hear it referred to in common speech. In my experience, this definition is similar to the one that most people would use to define the term evolution. That doesn't necessarily mean that it applies at the scientific level.&lt;/P&gt;
&lt;P&gt;Biological evolution, after all, is the nonrandom selection of random mutations. Does nonrandom selection always occur in the direction of additional complexity? The existence of complex organisms seems to suggest that this is the case. However, the existence of highly complex organisms instead more likely indicates that the variation has increased, not that selection is directional. And, of course, with mutation we expect variation to increase.&lt;/P&gt;
&lt;P&gt;This actually is somewhat intuitive, once you think about it. Given a particular starting point, a single random mutation will make the organism either somewhat more complex, or else somewhat less complex. The direction will be completely random. When the more complex mutation mutates again, it will again be either more complex or less complex. If one particular series of mutations that increased complexity happens to survive, then the result at the end of that chain is an organism that is much more complex than the starting organism. At the same time, if one particular series of mutations that decreased complexity happens to survive, then the result at the end of that chain is an organism that is much less complex than the starting organism. And, of course, there are a number of potential combinations of mutations that exist in the middle - where some of them have increased complexity and others have decreased. All that we can really say is that the variation has increased - not that complexity has been specifically selected.&lt;/P&gt;
&lt;P&gt;Where this gets complicated is when you have a boundary condition, which makes it appear as if overall complexity has been increasing. For example, if your starting point is a very simple single-celled organism, it is difficult to make that organism any less complex. Any variation that occurs as mutation takes place can only take place in the direction of additional complexity, because there is not much room to simplify a bacterium while there is an enormous amount of room to make it more complex, to the point where it can be as complex as a human being sitting at his tablet PC thinking about such things as evolution, complexity, and software platforms. So, the true effect of mutations over time is that variation increases. At the end, the branch that happened to increase complexity more often than not can eventually be extremely complicated. That does not mean that the evolution was directional - just that the variation happened to manifest itself in this way. There are still plenty of simple bacteria around (more than all other life forms combined, according to all of the literature that I have come across), but because they can't get much simpler you don't see the other side of the tail. And, in fact, massive complexity truly is the tail of the distribution. A small tail of highly complex organisms does not make evolution directional. It simply represents variation.&lt;/P&gt;
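&lt;P&gt;The argument above can be checked with a small simulation. What follows is a minimal sketch in Python, not a biological model: the function name, the population size, the number of generations, and the rule that every mutation shifts complexity by exactly one unit are all assumptions made for illustration. Each lineage takes an unbiased random walk with a wall at the minimum complexity.&lt;/P&gt;
&lt;PRE&gt;
```python
import random


def simulate_lineages(lineages=2000, generations=200, floor=1, seed=42):
    """Unbiased random walk in complexity with a lower wall at `floor`."""
    rng = random.Random(seed)
    population = [floor] * lineages
    for _ in range(generations):
        # Every mutation is equally likely to add or remove one unit of
        # complexity; max() enforces the wall (nothing simpler than `floor`).
        population = [max(floor, c + rng.choice((-1, 1))) for c in population]
    return population


population = simulate_lineages()

# Tally how many lineages ended up at each complexity level.
counts = {}
for complexity in population:
    counts[complexity] = counts.get(complexity, 0) + 1

print("simplest lineage:", min(population))
print("most complex lineage:", max(population))
print("most common complexity:", max(counts, key=counts.get))
```
&lt;/PRE&gt;
&lt;P&gt;Even though no single step favors complexity, the most complex lineage drifts far from the wall while a large share of the population stays close to it, which is the right tail described above.&lt;/P&gt;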
&lt;P&gt;D. W. McShea at Duke University (I don't know him personally) has been doing research specifically on the nature of evolutionary trends and complexity. A sample of his work on this topic that makes an interesting and reasonably approachable read can be found &lt;A href="http://www.biology.duke.edu/mcshealab/"&gt;here&lt;/A&gt;. We will simply jump to the conclusion (although the article itself is worth a read): 
&lt;BLOCKQUOTE&gt;The results here - two cases in which probability of increase was greater and two in which probability of decrease was greater - are consistent with and support the null hypothesis that increases and decreases are equally probable (or would if they had been randomly chosen).&lt;/BLOCKQUOTE&gt;In other words, the appearance of additional complexity cannot be assumed to be anything more than the appearance of additional variation. 
&lt;P&gt;&lt;/P&gt;
&lt;P&gt;This phenomenon has some interesting parallels with regards to the evolution of software. Software itself seems to be getting more complex over time, but are we deceiving ourselves by placing too much weight on the right tail of the distribution? In fact, much software is seemingly simple, and consists of surprisingly few lines of code. Consider the hobbyist, who is just putting together relatively few lines of code to do something interesting or useful.&lt;/P&gt;
&lt;P&gt;What does vary dramatically, however, is what those few lines of code can actually do. If you go back to Petzold-style C code for Windows, it takes quite a few lines of code just to get a window to appear. With the same number of lines of code today using Windows Forms or the Windows Presentation Foundation (formerly code name Avalon), you can likely do a lot more than just show a window, such as do some custom drawing or interesting animation. As platforms evolve, what that bit of code can do becomes increasingly sophisticated. At the same time, as people become more sophisticated with using the platform, they can squeeze even more out of those few lines of code. The variation increases as the body of code increases, and the right tail of the distribution becomes extremely interesting.&lt;/P&gt;
&lt;P&gt;You can see some of this effect by flipping through old issues of your favorite trade journal, such as MSDN Magazine. If you look at an issue from the beta days of the .NET Framework 1.0, you will see relatively basic samples and articles by today's standards. The same magazine today, targeting the same platform, will have much more sophisticated articles, and at the same time it will continue to have relatively straightforward and introductory articles. The variation increases along with the sophistication and maturation.&lt;/P&gt;
&lt;P&gt;Where am I going with this? Well, to some extent, I just think it's neat. But there is a point to be learned as well, even though software - following the principles of intelligent design - is not bound to the same restrictions as biological life and random variation. What point is that? Understanding when to migrate platforms in order to take advantage of the additional sophistication of the new platform.&lt;/P&gt;
&lt;P&gt;This is a decision that most organizations face on an ongoing basis. There are constantly new platforms released, which offer a huge number of features which may or may not be compelling. It is comparatively far more typical to maintain an existing application than it is to completely write a new application from scratch. How do you decide whether to port from one platform to another?&lt;/P&gt;
&lt;P&gt;Obviously, there is no easy answer, and I cannot possibly understand all of the economics and the starting point of every application. But, considered in the abstract, it's important to understand exactly where your application sits on a scale of complexity, compared to where it needs to sit in the ideal sense. Assume that your platform begins with a certain complexity built in (which it presumably does, in order to be selected over competing platforms). This complexity will have a minimum, and a practical maximum defined by the investment to achieve that maximum compared to the returns.&lt;/P&gt;
&lt;P&gt;An application that sits far in the right tail, demonstrating extreme complexity on the given platform, may not necessarily benefit from replatforming right away. What drives the replatforming decision is how much further you would like to go with your application, compared to the investment you make to undertake the replatforming.&lt;/P&gt;
&lt;P&gt;This is getting fairly abstract, so an example is probably in order. Assume that a platform begins with complexity 1. It is reasonable to expect that an application developed for this platform will eventually reach complexity 10 as developers become more sophisticated using it. Now, a new platform is released that makes development easier. The simplest application developed for this platform has a complexity of 4. We can reasonably expect the maximum practical complexity on this platform to reach 15.&lt;/P&gt;
&lt;P&gt;Since most applications developed are not particularly complex, they would immediately benefit from replatforming because they would already reach a complexity of 4 from a lower complexity. However, say that you spent a long time developing an application, and it had reached a complexity of 8. When you replatform, you immediately revert to a complexity of 4, and have to work your way back up to a complexity of 8 just to get to where you began. The up side? You can then keep moving, and achieve a theoretical maximum complexity of 15, much higher than before.&lt;/P&gt;
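&lt;P&gt;The arithmetic in this example is simple enough to write down. The sketch below is only an illustration: the Platform class, the replatform_outlook function, and the platform names are invented here, and the floor and ceiling values are the ones given when the two platforms were introduced (1 to 10 for the old platform, 4 to 15 for the new one).&lt;/P&gt;
&lt;PRE&gt;
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    name: str
    floor: int    # complexity of the simplest application on the platform
    ceiling: int  # maximum practical complexity on the platform


def replatform_outlook(current, old, new):
    """Summarize what moving an application from `old` to `new` implies."""
    return {
        # Complexity gained for free when the new floor is above the app.
        "immediate_gain": max(0, new.floor - current),
        # Complexity that must be rebuilt just to get back to today's level.
        "to_regain": max(0, current - new.floor),
        # Room left to grow if you stay versus if you move.
        "headroom_if_staying": old.ceiling - current,
        "headroom_after_moving": new.ceiling - current,
    }


# Numbers from the example above; the platform names are made up.
old_platform = Platform("old", floor=1, ceiling=10)
new_platform = Platform("new", floor=4, ceiling=15)

for current in (2, 8):
    print(current, replatform_outlook(current, old_platform, new_platform))
```
&lt;/PRE&gt;
&lt;P&gt;The simple application gains complexity just by moving. The application at complexity 8 has to rebuild 4 units before the extra headroom is worth anything.&lt;/P&gt;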
&lt;P&gt;This, in my opinion, is why you don't see applications such as Microsoft Word replatforming to .NET immediately - the investment to regain the complexity they already have is dramatic, and the decision to replatform will be made when the theoretical maximum of an existing framework is no longer sufficient.&lt;/P&gt;
&lt;P&gt;Of course, there are a huge number of other factors to weigh in to the decision, such as how much additional work is going in to the application, the knowledge base of your existing employees and those available to you in the market, the investment available for a particular application, but this gives a sense as to why you might want to replatform, and why you might want to wait or have this exercise happen in the background. It's all a part of the delicate balance between completing a new project quickly and making a new project as complex as possible to meet the needs of the marketplace.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=471478" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/cjacks/archive/tags/Software+Evolution/default.aspx">Software Evolution</category></item><item><title>Defining Units of Selection</title><link>http://blogs.msdn.com/cjacks/archive/2005/08/26/456877.aspx</link><pubDate>Fri, 26 Aug 2005 23:02:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:456877</guid><dc:creator>Chris Jackson</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/cjacks/comments/456877.aspx</comments><wfw:commentRss>http://blogs.msdn.com/cjacks/commentrss.aspx?PostID=456877</wfw:commentRss><wfw:comment>http://blogs.msdn.com/cjacks/rsscomments.aspx?PostID=456877</wfw:comment><description>&lt;DIV&gt;In my previous posts, I discussed the concepts of non-random selection and arms races. With this understanding in mind, we can start to see a very important concept arise.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;Accurately defining the unit of selection is absolutely critical to effectively evolving your software.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;It is all too easy to introduce pleiotropy into software. Pleiotropy in biology is the ability for a single gene to cause multiple phenotypes. For example, some cats have a particular allele which makes them white, and they&amp;nbsp;also happen to be deaf. Other alleles may cause changes in the face, ears, and hair - all from a single gene. In software, this principle generally expresses itself as unintended consequences - generally these are bugs.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;Once we put these two concepts together, we&amp;nbsp;gain some very important architectural learning, which will come as no surprise whatsoever to any software engineer: you want to design your software so that modifications to one system do not produce unintended side effects in another. While it seems quite obvious, it is not always that straightforward to execute on this.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;Furthermore, you would also like to architect your system to enable you to select into the next generation of your software those implementations which have given your software an evolutionary advantage, while mutating those elements of your system which have not given your software the competitive advantage you had hoped for.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;A highly visible illustration of this notion comes with the separation of user interface from business action. This is a pattern that has been around for quite some time, most commonly known as model-view-controller. As the user interface is something that is highly likely to change (this is the component that is most visible to the user, and tastes tend to change over time), you would like to be able to select those UI elements which make your software easy to use, while mutating those that are not comprehensible enough to surface a given functionality. Nonetheless, it is very common to see many logical dependencies sitting inside of UI code. How many web development projects on ASP.NET, for example, have you seen with business logic code sitting in the code-behind for a given page? How many button handlers in a Windows Forms application immediately take some action right inside of a form user interface object?&lt;/DIV&gt;
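&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;Here is a minimal sketch of that separation, written in Python rather than ASP.NET or Windows Forms to keep it short. Every name in it (InvoiceModel, Controller, the two view functions, the tax rule) is hypothetical. The point is only that the business rule lives outside the UI code, so a view can be swapped without touching the rule.&lt;/DIV&gt;
&lt;PRE&gt;
```python
class InvoiceModel:
    """Business rule only: knows nothing about how totals are displayed."""

    def __init__(self, tax_rate):
        self.tax_rate = tax_rate

    def total(self, subtotal):
        return round(subtotal * (1 + self.tax_rate), 2)


def verbose_view(total):
    """One way to present a total to the user."""
    return "Total due (tax included): {:.2f}".format(total)


def compact_view(total):
    """A replacement view; swapping it in does not touch the model."""
    return "{:.2f}".format(total)


class Controller:
    """The 'button handler': it delegates to the model, then to the view."""

    def __init__(self, model, view):
        self.model = model
        self.view = view

    def on_submit(self, subtotal):
        return self.view(self.model.total(subtotal))


model = InvoiceModel(tax_rate=0.25)
print(Controller(model, verbose_view).on_submit(80.0))
print(Controller(model, compact_view).on_submit(80.0))
```
&lt;/PRE&gt;
&lt;DIV&gt;The two views are separate units of selection: either one can be kept or mutated in the next generation while InvoiceModel stays exactly as it is.&lt;/DIV&gt;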
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;Once you begin thinking of your software with the goal of explicitly defining your units of selection, you can begin to architect your solution to achieve this goal.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;As an example, consider the architectural changes being undertaken for Windows Vista. Larry Osterman provides a description of this layered architecture &lt;a href="http://blogs.msdn.com/larryosterman/archive/2005/08/23/455193.aspx"&gt;here&lt;/A&gt;. In essence, what the Windows team has created is a system of layers wherein components in a given layer should not have a dependency on components in a higher layer. The lower a component lives in this layered system, the higher the probability of pleiotropy. As a result, low level components have been set up to evolve slowly and deliberately, as changes will involve extensive testing of all higher-layer components. Components that sit in a high layer are now able to evolve far more quickly, because there is a much smaller chance of pleiotropy in these components. Not only can they do so, but there is an implicit statement that they both want and expect to do so.&lt;/DIV&gt;
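&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;To make the layering rule concrete, here is a rough sketch in Python. It is not how the Windows team implements or enforces its layers. The component names, layer numbers, and dependency table are all invented for illustration. The first function flags dependencies that point at a higher layer, and the second counts what could be affected by a change, which is a crude stand-in for the potential for pleiotropy.&lt;/DIV&gt;
&lt;PRE&gt;
```python
from operator import gt

# Hypothetical components; layer numbers grow as you move up the stack.
LAYERS = {"kernel": 0, "graphics": 1, "shell": 2, "media_player": 3}
DEPENDS_ON = {
    "kernel": [],
    "graphics": ["kernel"],
    "shell": ["graphics", "kernel"],
    "media_player": ["shell", "graphics"],
}


def layering_violations(layers, depends_on):
    """Return (component, dependency) pairs that point at a higher layer."""
    return [
        (component, dependency)
        for component, dependencies in depends_on.items()
        for dependency in dependencies
        if gt(layers[dependency], layers[component])
    ]


def dependents(target, depends_on):
    """Everything that directly or indirectly depends on `target`."""
    found = set()
    changed = True
    while changed:
        changed = False
        for component, dependencies in depends_on.items():
            if component in found:
                continue
            if target in dependencies or found.intersection(dependencies):
                found.add(component)
                changed = True
    return found


print("violations:", layering_violations(LAYERS, DEPENDS_ON))
for name in LAYERS:
    print(name, "can affect", sorted(dependents(name, DEPENDS_ON)))
```
&lt;/PRE&gt;
&lt;DIV&gt;In this toy table a change to the lowest component can reach everything above it, while a change to the highest component reaches nothing else, which is why the high layers are free to evolve quickly.&lt;/DIV&gt;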
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;This is an appropriate starting point to architecting your software to evolve - explicitly defining those components which can be rapidly evolved. And, in any reasonably complex software project, this is an important statement to make, and it should be explicit.&lt;/DIV&gt;
&lt;DIV id=CSBloggerSig&gt;&lt;/DIV&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=456877" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/cjacks/archive/tags/Software+Evolution/default.aspx">Software Evolution</category></item><item><title>Selection and Evolutionary Arms Races</title><link>http://blogs.msdn.com/cjacks/archive/2005/08/15/451858.aspx</link><pubDate>Mon, 15 Aug 2005 17:13:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:451858</guid><dc:creator>Chris Jackson</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/cjacks/comments/451858.aspx</comments><wfw:commentRss>http://blogs.msdn.com/cjacks/commentrss.aspx?PostID=451858</wfw:commentRss><wfw:comment>http://blogs.msdn.com/cjacks/rsscomments.aspx?PostID=451858</wfw:comment><description>&lt;P&gt;Selection is the basis by which evolution can develop the enormously complicated systems that exist today. The underlying principle is non-random selection of random mutations. In any organism, there is some degree of genetic drift. Despite the built-in error correction of DNA replication, mutations still occur, and they occur randomly. I should be explicit here - they occur randomly with regards to the phenotype of that mutation. Some mutations will produce a phenotype that is non-selective. In other words, an individual who possesses that phenotype is neither more likely nor less likely to survive. Other mutations will be beneficial, increasing the likelihood of survival. Still others will be detrimental, decreasing the likelihood of survival. Because selection is non-random, it will usually select the phenotype that has superior survivability characteristics. So, if a single mutation increases, however slightly, the ability of an individual to survive, it is far more likely that the organism will, in fact, survive to reproductive age, and that this mutation will continue into the next generation.&lt;/P&gt;
&lt;P&gt;For example, assume that an organism randomly develops a mutation that makes a cell on its face more sensitive to light. That sensitivity allows it to see, however vaguely, that there is a shadow moving overhead, and run away. Such an organism is likely to have sufficiently higher survivorship than its companions, be more likely to reproduce, and eventually this selection will drive the majority of the population to have this mutation. Each successive mutation that refines this detection will be selected for, and eventually you end up with an eye.&lt;/P&gt;
&lt;P&gt;This has a couple of assumptions, mind you. The most important one: that there are too many organisms for a given environment. A given group of organisms will produce too many offspring to ensure the survival of all of them. As a result the remaining organisms must compete to survive. Because only some of them survive, most often it will be those organisms who possess the superior phenotype - filtering out mutations that are less beneficial. The same is also true between species: if two species are competing for the same set of limited resources, then mutations that either make them more effective at competing, or else mutations that cause them to seek out a different (and less hotly contested) set of resources will be selected.&lt;/P&gt;
&lt;P&gt;The idea of selection is just as important for software. This process is what drives the continued evolution of software, just as it drives the continued evolution in the natural world. The only difference is that mutations are non-random (or, at least, most of them are, we hope!), but selection is the process that determines which characteristics will survive into the next generation of software, and which will not.&lt;/P&gt;
&lt;P&gt;The importance of selection can be seen in a couple of ways. The first is in determining the evolutionary rate. Consider, first, the amount of competition for a limited set of resources. In this case, these resources will likely be either money or time. If there is a great deal of competition, then your software is likely to evolve fairly quickly. If there is not much competition, then the evolutionary rate is likely to slow. Mutations may still take place, but without a set of selection criteria these mutations will not survive based on merit, but will simply be part of the genetic drift.&lt;/P&gt;
&lt;P&gt;You can find an excellent example with web browsers. Once people began to realize that they really needed to have a web browser, then resources opened up to consume them. However, this set of resources was limited: people generally were content to have a single web browser. So, the competitors were vying for this resource, to become the choice. This spurred rapid innovation in the web browser market, as competitors struggled for their very survival against other web browsers. Eventually, one browser came to be the victor, consuming the vast majority of the resources (installed desktops) available in the market. The competition died down, and so did the innovation. Even assuming that development continued at the same rapid pace on the product, without contention for resources, there would have been no way to select specifically for those features which are of the greatest tangible benefit to users, against those features that are pure genetic drift. The product simply cannot evolve as quickly.&lt;/P&gt;
&lt;P&gt;However, once competition emerged again, selection could come back into play. Features could be added to enhance the product, and those features that drove adoption would be selected for, driving evolution far more rapidly. Non-random selection is critical.&lt;/P&gt;
&lt;P&gt;It is important to keep in mind, however, that not every mutation in a software organism that the environment is currently selecting for is necessarily the unit of causation of that selection. If you come up with a new browser that supports tabbed browsing, a new security model, and the ability to render all pictures in shades of purple instead of full color, can you necessarily say that rendering all pictures in shades of purple was instrumental in the success of the product? In much the same way, can you say that tabbed browsing is a mutation that is selected for, or was it actually security that was selected for? It is important not to conflate the issues; the environment, given a degree of competition, will provide selection. However, you cannot always interpret any one factor as being key to the success of an adaptation.&lt;/P&gt;
&lt;P&gt;An even more extreme scenario comes with evolutionary arms races. Competition for a limited set of resources drives rapid evolutionary change. Competition for immediate survival compels even more rapid evolutionary change. Consider the evolution of the cheetah and the antelope. The antelope will evolve to run away more quickly. An antelope with a mutation that will allow it to run a little bit faster will have a much greater chance of survival into the next generation. By the same token, a cheetah with a mutation that enables it to run a little bit faster will catch more food, thus increasing the chance of this mutation surviving to the next generation. Here, the success of one organism definitively means the failure of the other. This is an arms race: in this scenario, evolution takes place very quickly.&lt;/P&gt;
&lt;P&gt;The most direct analogy here is towards security. Software that evolves to be more secure is more likely to survive. So, the attackers become more sophisticated, creating new malicious software able to overcome these obstacles. Both sides evolve their software more rapidly than in any other evolutionary environment. But, in this case, there is some degree of hope for everyone who is trying to develop secure software while facing attacks from continually more sophisticated attackers. To quote Richard Dawkins, "if the predator loses the race, he simply loses a meal. If the prey loses the race, he loses his life." The pressure is on everyone developing software to make it more secure, or they will lose altogether. Thus, it seems likely that many will be able to maintain this accelerated evolutionary rate, because they simply have no choice.&lt;/P&gt;
&lt;P&gt;Understanding the environment of your software is critical to guiding its evolution. What are the resources? Who is competing for them? And what features are actually being selected for? Selection is a powerful principle that has done amazing things in nature, and can do similarly amazing things for software when it is leveraged.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=451858" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/cjacks/archive/tags/Software+Evolution/default.aspx">Software Evolution</category></item><item><title>Evolution vs. Revolution</title><link>http://blogs.msdn.com/cjacks/archive/2005/07/26/443401.aspx</link><pubDate>Tue, 26 Jul 2005 12:38:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:443401</guid><dc:creator>Chris Jackson</dc:creator><slash:comments>4</slash:comments><comments>http://blogs.msdn.com/cjacks/comments/443401.aspx</comments><wfw:commentRss>http://blogs.msdn.com/cjacks/commentrss.aspx?PostID=443401</wfw:commentRss><wfw:comment>http://blogs.msdn.com/cjacks/rsscomments.aspx?PostID=443401</wfw:comment><description>&lt;P&gt;In my previous posts, I have been arguing the point that throwing away source code and starting over from scratch is a notably bad idea in general. In this, I am echoing what Joel Spolsky says so eloquently in his post &lt;A href="http://www.joelonsoftware.com/articles/fog0000000069.html"&gt;Things You Should Never Do&lt;/A&gt;.&lt;/P&gt;
&lt;P&gt;(Incidentally, I just observed the zero-filled naming convention that Joel is using in the URLs of his posts. This is post number 0000000069. Either Joel is expecting to be astonishingly prolific, on the order of tens of thousands of posts per day, he is expecting to be outlandishly long lived, or he really likes the number zero.)&lt;/P&gt;
&lt;P&gt;I think it would be unfortunate to interpret this sentiment as being absolute in any way. Quite simply, there are some situations where it is, indeed, appropriate to completely write from scratch. The only question is, how do you decide?&lt;/P&gt;
&lt;P&gt;A good example to consider is the revolution that took place in aircraft engines. Originally, we used propellers, and now we have the option of jet engines. The move to jet engines was anything but an evolution. Rather, their designers began with a clean slate. In fact, had they worked under the constraint that they must begin with a propeller engine, and gradually make changes until it became a jet engine, it would have been much more costly overall. Biological life has the constraint of natural selection. You begin with a certain organism, and you can gradually evolve it. At each step, the organism must have survival characteristics equal to or better than those of the step before it. Had we applied the same constraint to jet engines, then engineers would have been required to develop a series of engines evolving the propeller to the jet, each of which would have had to succeed in the marketplace.&lt;/P&gt;
&lt;P&gt;Fortunately, they did not operate under that constraint. Your software does not need to either. So, evolution and revolution are both tools that you have available to work with the software organism, and making intelligent decisions regarding the best approach is absolutely critical.&lt;/P&gt;
&lt;P&gt;This truly is the trick, then. If the code you are writing actually exhibits the same phenotype as code that already exists, then you are fundamentally wasting your time. Observe what happens, in the world of natural selection, when two genes exhibit identical phenotypes. The genes that are present in the majority of organisms, no matter how small that majority is, will eventually have a dominant position in the gene pool. By creating new software DNA to exhibit the same phenotype, you are most likely wasting your time because you not only divert resources from creating new functionality, but also doom that code to obscurity and eventual demise. You lose an opportunity for achieving immortality through your software - and isn't that part of the fun?&lt;/P&gt;
&lt;P&gt;Propellers have fundamental limitations that justified the investment in developing jet engines. No matter how carefully the designers worked to evolve the design, they simply could not achieve the results they needed without taking a different approach altogether. Is there a fundamental limitation in the nature of the software you are creating? Can it be evolved and refactored, or should it be completely reengineered? And, if you need to engineer a different solution altogether, you should choose the smallest necessary piece to re-engineer. With the advent of the jet engine, parts of the plane needed to evolve to support the new engine design, but the concept of the plane itself could be evolved, rather than re-developed from scratch.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=443401" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/cjacks/archive/tags/Software+Evolution/default.aspx">Software Evolution</category></item><item><title>Evolving an Imperfect Design</title><link>http://blogs.msdn.com/cjacks/archive/2005/07/08/436850.aspx</link><pubDate>Fri, 08 Jul 2005 13:44:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:436850</guid><dc:creator>Chris Jackson</dc:creator><slash:comments>4</slash:comments><comments>http://blogs.msdn.com/cjacks/comments/436850.aspx</comments><wfw:commentRss>http://blogs.msdn.com/cjacks/commentrss.aspx?PostID=436850</wfw:commentRss><wfw:comment>http://blogs.msdn.com/cjacks/rsscomments.aspx?PostID=436850</wfw:comment><description>&lt;P&gt;I continue to be surprised by suggestions that an entire body of code - one which has proven its ability to survive in the software ecosystem, should be completely disposed of and replaced with new, less "buggy" code. I read another treatise on this recently, and I still fail to understand the logic. Why would humans, with the same human imperfection, suddenly start writing completely bug-free code now? 
Instead, we should evolve this imperfect design.&lt;/P&gt;
&lt;P&gt;As an analogy, consider the human eye. If you look at the structure, what you will find is exactly the opposite of what the typical engineer would expect (and, in fact, exactly the opposite of what exists in cephalopods): the light-sensing portion of rods and cones is actually tucked away behind several neuronal layers, as well as the cell's own nucleus.&lt;/P&gt;
&lt;P&gt;&lt;IMG title=http://upload.wikimedia.org/wikipedia/en/2/21/Fig_retine.png alt="Retina's simplified axial organisation. The retina is a stack of several neuronal layers. Light is concentrated from the eye and passes across these layers (from left to right) to hit the photoreceptors (right layer). This elicits chemical transformation mediating a propagation of signal to the bipolar and horizontal cells (middle yellow layer). The signal is then propagated to the amacrine and ganglion cells. These neurons ultimately may produce action potentials on their axons. This spatiotemporal pattern of spikes determines the raw input from the eyes to the brain." src="http://upload.wikimedia.org/wikipedia/en/2/21/Fig_retine.png"&gt;&lt;/P&gt;
&lt;P&gt;(The image above is taken from wikipedia.org and has been released to the public domain.)&lt;/P&gt;
&lt;P&gt;One consequence of this design is that the light-sensing portion of these cells needs to be more sensitive in order to process a smaller amount of light, given that this light is filtered. This design has evolved to compensate for a shortcoming in the original design. It was simply more expensive to replace the existing design than to make this modification, in order to achieve phenotypically similar results.&lt;/P&gt;
&lt;P&gt;Another consequence is that these nerves must find their way out of the eye, which entails going through the layer of light-sensitive cells. That leads to a blind spot, and the eye can't compensate much for this on its own. Here, another system entirely - the brain - is the compensating agent. This is important to consider, particularly in large enterprises with existing infrastructure.&lt;/P&gt;
&lt;P&gt;When you are faced with designs that are less than perfect, throwing them all away is always an option, but it will often be the most expensive one. So, when considering your software organism, there is always the potential to evolve that system in such a way as to compensate for that design shortcoming or make it irrelevant. There is additionally the option of using system integration to allow another system to handle the compensation. Of course, if your systems are not designed with integration in mind (which is itself a shortcoming that additional evolution can address), then your options are limited.&lt;/P&gt;
&lt;P&gt;In every case, considering how to evolve software is a complex decision. Yes, completely beginning again is always an option. Evolving the existing software is another option. Compensating with a connected system is another option. In fact, this analogy (the ability of disparate systems to serve as the point of evolutionary adaption) only begins to scratch the surface of the power of integration.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=436850" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/cjacks/archive/tags/Software+Evolution/default.aspx">Software Evolution</category></item><item><title>Single Step Selection</title><link>http://blogs.msdn.com/cjacks/archive/2005/06/30/434377.aspx</link><pubDate>Fri, 01 Jul 2005 00:06:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:434377</guid><dc:creator>Chris Jackson</dc:creator><slash:comments>1</slash:comments><comments>http://blogs.msdn.com/cjacks/comments/434377.aspx</comments><wfw:commentRss>http://blogs.msdn.com/cjacks/commentrss.aspx?PostID=434377</wfw:commentRss><wfw:comment>http://blogs.msdn.com/cjacks/rsscomments.aspx?PostID=434377</wfw:comment><description>&lt;P&gt;Have you ever pondered about some really amazing feature of the biological world? The eye? The ear? The sense of touch? Bird flight? These are features evolved very gradually, over many generations. The net result was something that seems incredibly impressive. Sometimes so much so that it is hard to imagine that something so impressive could have evolved at all, yet they did.&lt;/P&gt;
&lt;P&gt;None of these features evolved through single step selection. A group of eyeless creatures did not suddenly have offspring that had an impressive eye, complete with a focusing lens, a retracting iris, and appropriate neural connections. It is far more likely that one cell happened to be more light sensitive, so it evolved to have even more light sensitivity because that provided a competitive advantage. Eventually, a mutation came along that had two light-sensitive eyes. This gradual and naturally selected process continued until the eyes that we see today in all sorts of creatures had developed. (This, incidentally, happened more than once.) Of course, it is theoretically possible for something as complicated as an eye to evolve in a single generation. However, the odds of that happening, and happening in a way that is beneficial to the organism, are incredibly small. In fact, there are mechanisms in place to prevent such massive changes in DNA structure, because most massive mutations are not good things.&lt;/P&gt;
&lt;P&gt;The same is true of software. Most great software has evolved over many versions. Seldom is it exactly right the first time. Of course, a version of software is a somewhat arbitrary measure, because release criteria are not always equivalent. One developer may release 30 beta versions over 3 years before finally declaring a build V1.0. Another developer may release just two betas over 6 months before declaring a build V1.0. Therefore, we cannot take the version number at face value. The idea, however, is still the same. You are probably not going to get your software right the first time. You will release it into the wild, and all of a sudden, you will find a number of things wrong with it. In response, you evolve and refactor your software (assuming that it survives the first release), and the second release is generally better because you have directionally (as opposed to randomly) evolved it with a particular phenotype in mind.&lt;/P&gt;
&lt;P&gt;Avoiding single step selection is a good idea. You wouldn’t want to just sit down with a text editor and start writing code (which will generate digital DNA) for an operating system, proceed until you have implemented everything, and then compile and expect that, in a single step, you will get exactly the phenotype you are hoping for.&lt;/P&gt;
&lt;P&gt;This has behavioral implications. This is why a daily build is such a good idea. You generate a functioning organism every single day, and you can measure its phenotype against the expected phenotype, incorporating necessary mutations in the work that follows. It also makes beta releases a good idea. You are placing a particular set of DNA into an environment that more closely represents the “wild” of released software, so that again you can measure the actual phenotype against expectations.&lt;/P&gt;
&lt;P&gt;I am a huge proponent of Microsoft’s latest beta approach – the Community Technology Preview. This puts the organism in the hands of as many people as possible, so that natural selection determines which features make it into the final version, which features mutate, and which features disappear entirely. Allowing the actual environment to influence mutation earlier and more frequently helps ensure that the organism is as viable as possible when you release the final version of your software.&lt;/P&gt;
&lt;P&gt;Avoiding single step selection is also a compelling argument to avoid complete re-writes of software. Occasionally, a fresh developer will look at the code and suggest that they should lose all of the existing code and start again from scratch. After all, that code is old and full of messy bug fixes. What they apparently do not understand is that those mutations were necessary for the software’s survival. While another mutation might exhibit the same phenotype, how likely are you to find that more eloquent mutation on the first try? It is critical to leverage the value of the mutations to the software organism’s survival, and continue to mutate with future survival in mind. It is best to avoid mutation simply because it is old. In software, as in biological life, genes that have survived for quite a long time are very likely to be genes well adapted to survival. Why would you want to lose them?&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=434377" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/cjacks/archive/tags/Software+Evolution/default.aspx">Software Evolution</category></item><item><title>Mutation and Genes</title><link>http://blogs.msdn.com/cjacks/archive/2005/06/23/431968.aspx</link><pubDate>Thu, 23 Jun 2005 21:05:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:431968</guid><dc:creator>Chris Jackson</dc:creator><slash:comments>0</slash:comments><comments>http://blogs.msdn.com/cjacks/comments/431968.aspx</comments><wfw:commentRss>http://blogs.msdn.com/cjacks/commentrss.aspx?PostID=431968</wfw:commentRss><wfw:comment>http://blogs.msdn.com/cjacks/rsscomments.aspx?PostID=431968</wfw:comment><description>&lt;P&gt;From the comments I received, it is apparent that I rushed through my description of mutation, which seems to have led to some confusion. I will attempt to rectify that shortcoming.&lt;/P&gt;
&lt;P&gt;When I speak of mutation being non-random in biological life, there are a couple of ways to think of this. First, consider spontaneous mutation – a piece of the genetic code transforms itself from one message to another: we add a new nucleotide, or we remove an existing nucleotide. Assume no external influence. The rate at which these bits change depends entirely on what these nucleotides are. Some portions of that genetic code are more likely to change than others, due to the inherent stability of the molecules and the degree to which these molecules really like being next to each other. Not every nucleotide is equally likely to change in an event of spontaneous mutation. Thus, it is not random. Now, you can also consider external factors. If the new combination is particularly unstable, that mutation may not last long at all, and instead trigger additional mutation. That mutation may produce a phenotype that is incapable of survival. These are also highly non-random.&lt;/P&gt;
&lt;P&gt;What is seemingly random is the phenotypic response to a particular mutation. It is convenient for the molecule itself to mutate – this does not occur for the benefit of the organism. A nucleotide sequence in DNA will mutate purely because it is more unstable than another one. It does not matter to the forces governing the mutation whether the resulting organism will benefit from this mutation or not. There is no intelligent design for this mutation. The DNA sequences governing eyesight, for example, are not going to mutate purposefully in order to improve that phenotypic response. If eyesight improves, it will be the result of random mutation, and disproportionate survival of organisms whose eyesight turned out better because of these mutations (natural selection).&lt;/P&gt;
&lt;P&gt;Mutation in software occurs primarily with explicit intent. (Yes, there are examples of viruses and other malicious software modifying the underlying software instructions, in much the same way that a gene researcher may insert a new sequence into an existing biological organism.) A jnz instruction will not mutate spontaneously into a jz instruction, for example. (A particular instance of this software – one cell – may do that in the event of hardware failure, but the software itself does not.) The creator, and the overseer of this organism, will change the underlying code with the intent of creating a phenotypic response that increases its chance of survival.&lt;/P&gt;
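&lt;P&gt;To make the hardware-failure case concrete, here is a minimal sketch in Python. It assumes the x86 short-jump encodings, where the opcode byte 0x75 is jnz (assemblers also accept the spelling jne for the same byte) and 0x74 is jz. The helper names are hypothetical and exist only for this illustration. The point is simply that a single flipped bit in one "cell" is enough to invert a branch.&lt;/P&gt;
&lt;PRE&gt;
```python
# Illustrative sketch only. On x86, a short conditional jump is one opcode
# byte followed by a one-byte displacement. 0x75 encodes jnz and 0x74
# encodes jz. The function names below are hypothetical.
JNZ = 0x75
JZ = 0x74


def flip_bit(byte, bit):
    """Return byte with the given bit (0 = least significant) inverted."""
    return byte ^ (2 ** bit)


def differing_bits(a, b):
    """Count the bit positions in which two byte values differ."""
    return bin(a ^ b).count("1")


# Simulate a hardware fault that inverts the lowest bit of the opcode.
corrupted = flip_bit(JNZ, 0)

print(hex(corrupted))            # 0x74
print(corrupted == JZ)           # True
print(differing_bits(JNZ, JZ))   # 1
```
&lt;/PRE&gt;
&lt;P&gt;Nothing in the released software changed here; only one copy of it did, which is why this kind of change belongs to the cell rather than to the organism.&lt;/P&gt;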
&lt;P&gt;Given the concept of intelligent design and purposeful mutation, we can consider how we want to go about mutating our software (since I believe we can all agree that software is not yet perfect). This brings up another reader question – what is the boundary of the gene, as opposed to the entire body of DNA for a piece of software? The truth is that this is something that is under the shared control of both our tools and us, as software developers. The reader suggested something like a class or a struct. While there certainly are some elements of the analogy that strongly suggest this, my tendency is to disagree. Why? This is an artifact of the tools we happen to use for intelligent design of software DNA. I create a class because it makes it easier for me to create my software. The design of my class drives the creation of a particular type of software DNA only when I am using the exact same compilers in the exact same environment. If I use the same compiler for that class, I will end up with the same DNA. If I use a different compiler, I may end up with completely different DNA that exhibits a completely different phenotype. For example, I may compile some C++ code using a compiler optimized meticulously for one particular processor. The result will – presumably – be code that runs faster on that processor. If my intent is to create a high performance library that only needs to run on that processor, the same input created a vastly different expression with vastly different survival characteristics.&lt;/P&gt;
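&lt;P&gt;As a rough illustration of the same source producing different DNA, the sketch below uses Python bytecode as a stand-in for machine code (an assumption made purely for brevity; the paragraph above is about C++ compilers). Python's built-in compile() accepts an optimize argument, and a level of 1 strips assert statements. The source string and helper names are made up for this example.&lt;/P&gt;
&lt;PRE&gt;
```python
# Illustrative sketch only: Python bytecode stands in for compiled binary
# code. The source text and the helper names are hypothetical.
SOURCE = (
    "assert value != 0, 'value must be non-zero'\n"
    "result = 'ran to completion'\n"
)


def build(optimize):
    """Compile the same source text at the given optimization level."""
    # optimize=0 keeps assert statements; optimize=1 removes them.
    return compile(SOURCE, "sample", "exec", optimize=optimize)


def run(code, value):
    """Execute a compiled build and report the behavior it exhibits."""
    scope = {"value": value}
    try:
        exec(code, scope)
    except AssertionError:
        return "assertion failed"
    return scope["result"]


plain = build(0)
stripped = build(1)

print(plain.co_code == stripped.co_code)  # False
print(run(plain, 0))                      # assertion failed
print(run(stripped, 0))                   # ran to completion
```
&lt;/PRE&gt;
&lt;P&gt;The two builds come from identical source, yet their bytes differ and so does their observable behavior when given the same input.&lt;/P&gt;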
&lt;P&gt;To me, a gene is a segment of binary code – a unit of deployment, if you will. I have seen (poorly written) software that embeds a huge amount of logic into a single logical unit of deployment – both at the human-readable source code level and the actual binary DNA level. Others more carefully analyze their software and ship it as a bundle of genes that can re-use each other and make the process of evolution more straightforward.&lt;/P&gt;
&lt;P&gt;However, since developers are predominantly human, then it is probably more useful to think in terms of the genes as having some meaning at the source code level. Even though I do not personally believe this is the best analogy, it is far more useful in a world where very few people write their code in 1’s and 0’s.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=431968" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/cjacks/archive/tags/Software+Evolution/default.aspx">Software Evolution</category></item><item><title>Terminology and Non-Random Mutation</title><link>http://blogs.msdn.com/cjacks/archive/2005/06/20/430783.aspx</link><pubDate>Mon, 20 Jun 2005 18:37:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:430783</guid><dc:creator>Chris Jackson</dc:creator><slash:comments>3</slash:comments><comments>http://blogs.msdn.com/cjacks/comments/430783.aspx</comments><wfw:commentRss>http://blogs.msdn.com/cjacks/commentrss.aspx?PostID=430783</wfw:commentRss><wfw:comment>http://blogs.msdn.com/cjacks/rsscomments.aspx?PostID=430783</wfw:comment><description>&lt;P&gt;I want to take a moment to go back and review some of the terminology I have been using, to ensure that there is no confusion. The reader will kindly indulge any ambiguity in my language up to this point – I am quite literally making this up as I go along.&lt;/P&gt;
&lt;P&gt;&lt;FONT size=4&gt;&lt;STRONG&gt;Binary Code == DNA&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;In the analogy I have been using, binary code represents the DNA of a software organism. Why the binary code, rather than the original source code, or the diagrams that you used to design the source code? The binary code does the work. It drives the expression of the phenotype. The source code, and any documentation that guided its creation, is an artificial construct used to generate this particular binary code. (I will speak more on these constructs later.) At the most basic level, consider the fact that you can make a copy of software using the binary code only.&lt;/P&gt;
&lt;P&gt;&lt;FONT size=4&gt;&lt;STRONG&gt;Single Installation of Software == Cell&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;Given a set of DNA, you now host that encoded information in an environment where it can develop. This environment is a single installation of a computer somewhere. That environment provides the means of expressing phenotype, and of survival itself for that cell. For example, consider a scenario where you write a software application that depends on the .NET Framework. You, therefore, depend on the expression of that particular set of DNA in order to operate. Now, how you draw this analogy is a matter of some debate. Since this is also DNA in that particular cell, it really is not different from the DNA in your application. In other situations, you may consider some DNA analogous to mitochondria, where it survives independently and provides critical services to the entire cell. This really is a detail of implementation – the cell has two distinct mechanisms contained inside of one barrier. We will not concern ourselves with perfecting our analogy to this point. What is critical is that you begin with a set of DNA, and it must work in concert with the other DNA in that “cell” in order to express itself or even to have the cell survive. It may depend on other DNA being there and expressing itself, and other DNA being there and expressing itself may adversely affect it.&lt;/P&gt;
&lt;P&gt;&lt;FONT size=4&gt;&lt;STRONG&gt;Entire Installed Base == Organism&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;While a single installation of software must cooperate with, compete with, or ignore the other software DNA on a particular installation, the entire installed base must concern itself with the entire ecosystem of survival. Will it meet with broad acceptance and acquisition, or will it drift away into obscurity? This delves into issues such as economics and emotional reaction to the software. The ability to behave well on a single installation does not guarantee survival, just as having perfect cells may not help you much when you happen to be sitting in a room full of hungry leopards.&lt;/P&gt;
&lt;P&gt;Note that I am not so bold as to suggest that a SKU is the best way to define where one organism stops and another begins. A single software organism may consist of several products combined in some sort of useful and interesting way.&lt;/P&gt;
&lt;P&gt;&lt;FONT size=4&gt;&lt;STRONG&gt;Selection&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;In my analogy, selection at the cellular level determines the extent to which any organism can grow. If software does not operate in a majority of installations, the organism itself will remain small. (Say, for example, that the software only works on an obscure operating system, and must be paired with an extremely expensive companion software package.) Selection at the organism level determines the extent to which a viable set of DNA, perfectly able to grow and operate in a number of environments, will actually be able to compete with other software organisms for acquisition and use.&lt;/P&gt;
&lt;P&gt;&lt;FONT size=4&gt;&lt;STRONG&gt;Mutation&lt;/STRONG&gt;&lt;/FONT&gt;&lt;/P&gt;
&lt;P&gt;When you think of true Darwinian evolution, you must take into account the idea of mutation. This mutation is decidedly non-random. Rather, it is dependent on the laws of physics. Certain molecules will change at measurable rates. Combinations of molecules will change at measurable rates. (For example, this is how we can determine the half-life of a given molecular structure.) These mutation rates differ between different sequences of molecules. In addition to varying rates of spontaneous mutation, there are also differences in the success rates of mutations. For example, the rate of variation in the histone gene is remarkably small across all eukaryotes specifically because variation in this gene is extremely maladaptive. (The protein it provides the recipe for plays a pivotal role in gene regulation, as well as forming the spools around which DNA winds.) However, the phenotypic expressions of gene mutation are random. If one sequence of DNA mutates at a rate of once every 10 years, this mutation will occur whether or not that mutation gives rise to either perfect eyesight or a complete lack of a liver. (Not that either one is likely to result from a single mutation.)&lt;/P&gt;
&lt;P&gt;Software mutation, on the other hand, is not at all random concerning phenotype (although human imperfection certainly makes it seem like this is the case at times). Software mutation takes place with the explicit purpose of creating a new, and supposedly superior, phenotype. This is an important distinction between software and our biological analogy. While we are still somewhat concerned with error checking to determine the health of our DNA, this is explicitly in response to parasitic modification in a particular cell. We do not need to regulate our software’s own tendency towards spontaneous mutation with random phenotypic results. We guide the evolution of software with intelligent design.&lt;/P&gt;
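&lt;P&gt;The error checking mentioned above can be sketched very simply. The Python below is one possible approach, not a description of any particular product: it records a SHA-256 fingerprint of the released binary and compares a given installation against it. The byte values and function names are invented for the example.&lt;/P&gt;
&lt;PRE&gt;
```python
import hashlib


def fingerprint(binary):
    """Return the SHA-256 digest of a binary image as a hex string."""
    return hashlib.sha256(binary).hexdigest()


def is_healthy(binary, expected):
    """Report whether this copy still matches the released fingerprint."""
    return fingerprint(binary) == expected


# Hypothetical four-byte "binary"; real software would be read from disk.
released = bytes([0x75, 0x05, 0x90, 0xC3])
expected = fingerprint(released)

# Simulate parasitic modification of one cell: the first byte is replaced.
tampered = bytes([0x74]) + released[1:]

print(is_healthy(released, expected))  # True
print(is_healthy(tampered, expected))  # False
```
&lt;/PRE&gt;
&lt;P&gt;A check like this detects that a cell's DNA has changed; it says nothing about whether the change was malicious, and it is no substitute for code signing.&lt;/P&gt;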
&lt;P&gt;So, how then do we guide the evolution of our software to take into account that which parallels biological life, but also that which is fundamentally different (intelligent design)?&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=430783" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/cjacks/archive/tags/Software+Evolution/default.aspx">Software Evolution</category></item><item><title>On the Nature of Software Organisms and Selection</title><link>http://blogs.msdn.com/cjacks/archive/2005/06/17/430170.aspx</link><pubDate>Fri, 17 Jun 2005 17:14:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:430170</guid><dc:creator>Chris Jackson</dc:creator><slash:comments>4</slash:comments><comments>http://blogs.msdn.com/cjacks/comments/430170.aspx</comments><wfw:commentRss>http://blogs.msdn.com/cjacks/commentrss.aspx?PostID=430170</wfw:commentRss><wfw:comment>http://blogs.msdn.com/cjacks/rsscomments.aspx?PostID=430170</wfw:comment><description>&lt;P&gt;In my &lt;a href="http://blogs.msdn.com/cjacks/archive/2005/06/16/429757.aspx"&gt;last entry&lt;/A&gt;, I attempted to illustrate (hopefully with some degree of success) the reasoning behind viewing software as an organism, and all of the associated learning we may gain from such a comparison. In this entry, I am hoping to clarify this analogy a bit more, in order to provide for us a launching point to leverage this analogy more productively.&lt;/P&gt;
&lt;P&gt;The aspect I hope to clarify specifically is the boundaries of the organism, as well as the boundaries of taxonomy. What would we consider a single instance of a software organism? What defines a species? This does influence how we are able to draw some conclusions, so I believe this exercise truly is important.&lt;/P&gt;
&lt;P&gt;Consider, for example, Microsoft Word (which I am using to author this post). The underlying DNA behind this application is the binary code for the 2003 version, Service Pack 1, with all of the latest patches applied. Does this particular instance of Microsoft Word represent an organism, or do all instances of a particular version (at the micro-level, meaning that the next time a patch comes out I will have a new version), put together, represent a single organism?&lt;/P&gt;
&lt;P&gt;The best logical argument I can come up with will classify the entire collection of instances of a particular version as a single organism. Using our analogy, consider the human body. It consists of a large number of cells, each containing the same DNA. These cells, depending on their location, will exhibit a phenotype that depends on the chemical environment within and around that cell.&lt;/P&gt;
&lt;P&gt;In a similar way, one instance of Microsoft Word may be trying to operate in an environment where it cannot survive. (For example, an operating system other than the one the developers targeted.) It may be operating in an environment where it does not perform well (such as an instance running on a very busy e-commerce server). It may be running on a computer where a virus has changed its binary code – literally modifying its DNA so it exhibits a different phenotype. The poor health of one such instance will not necessarily harm the organism itself, until such time as inflexibility to variations in the electronic environment causes people to stop acquiring and using the product, eliminating it in a process of natural selection in favor of a superior alternative.&lt;/P&gt;
&lt;P&gt;It is convenient that this classification also happens to be very useful. By considering every instance of a version as an organism, we can then consider other instances of the same species (in this case, a rival word processor) as well as ancestry (previous versions of the same application).&lt;/P&gt;
&lt;P&gt;It also offers us the opportunity to measure success – perhaps using sales figures, download rates, and lifespan. We gain the concept of selection. When a developer releases a version of software, that software organism grows to a particular size. The nature and rate of growth of that organism determines if the developer creates another version (organism). It also may give rise to competitive organisms, which seek the same resources (money) that the existing organism is consuming.&lt;/P&gt;
&lt;P&gt;This provides a strong analogy to evolution. Genetic code. Selection. Mutation. Embryology (the environment in which the organism grows). Assuming that we agree on this as a starting point, it’s probably about time to start leveraging this analogy productively rather than continue to strengthen the case for using it.&lt;/P&gt;
&lt;P&gt;I have had a couple of comments regarding where I am going with this. All have suggested genetic algorithms, which are interesting, but this was not where I was originally heading. (Of course, now I feel almost obligated to head there at some point.) To Ralf, I say yes – I do intend to explore how we could leverage this to build the next ERP system. :-)&lt;/P&gt;
&lt;P&gt;In addition, somebody contacted me because he was unable to leave a comment on the blog itself. For now, I have enabled anonymous comments, and we will see how that goes. &lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=430170" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/cjacks/archive/tags/Software+Evolution/default.aspx">Software Evolution</category></item><item><title>Software as an Organism</title><link>http://blogs.msdn.com/cjacks/archive/2005/06/16/429757.aspx</link><pubDate>Thu, 16 Jun 2005 15:29:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:429757</guid><dc:creator>Chris Jackson</dc:creator><slash:comments>4</slash:comments><comments>http://blogs.msdn.com/cjacks/comments/429757.aspx</comments><wfw:commentRss>http://blogs.msdn.com/cjacks/commentrss.aspx?PostID=429757</wfw:commentRss><wfw:comment>http://blogs.msdn.com/cjacks/rsscomments.aspx?PostID=429757</wfw:comment><description>&lt;P&gt;Can we correctly describe software as an organism?&lt;/P&gt;
&lt;P&gt;I believe that we can make a compelling argument to do exactly that. To achieve this, I first intend to run through analogies that will describe some of the correlations between software and biological life, which may help to explain why we would want to embark on such an exercise in the first place. If we can agree on this, then we can explore some more compelling arguments to use this terminology, which I hope will lead us to some conclusions that will advance the way we think about, design, and build software.&lt;/P&gt;
&lt;P&gt;When you run software, you experience the phenotype generated by the software code. Therefore, to use biological terms, the source code is analogous to DNA, and the executing binary is analogous to the physical manifestation of that DNA after it develops in the environment where that DNA happens to be situated (typically an egg cell of some size and shape).&lt;/P&gt;
&lt;P&gt;Of course, this may be a flawed analogy. DNA, after all, is more of a recipe than a blueprint. Source code may be more of a blueprint than a recipe. For example, there is no DNA dictating exactly how to build an eye, which we could remove, insert into an egg cell, and grow only that eye. Rather, the DNA tells the original egg cell to divide in such a way that there is a slight difference between the two resulting cells chemically. In these two new cells, the slight differences trigger the reading of slightly different DNA strands from this cookbook, producing two cells each that are, again, slightly differentiated. This process continues until a single cell has been chemically prepared enough to be the precursor to an eye, which enables the reading of the DNA that specifies the design of that eye. In effect, you have a recipe for creating the chemical environment necessary to generate an eye and read additional components of the DNA that direct any variations in eye design that the chemical variation is designed to support.&lt;/P&gt;
&lt;P&gt;We generally conceive of software, on the other hand, as much more of a blueprint. A menu item exists because software code specifically dictates that the computer should draw a menu item there, with the following attributes. However, this way of thinking about it may be too simplistic. How many times have you had one computer operation work repeatedly, but suddenly, on one occasion, this operation no longer works? Maybe the computer does not draw that menu item for some reason. (We can seldom explain that reason without a healthy dose of knowledge and some time with a debugger.) The phenotype of that software has now changed because of changes in the electronic (as opposed to chemical) environment surrounding that software! At commercial software companies, we see this sometimes with bug fixes. We fix one piece of software, which may fix one problem but alters the electronic environment for all other software. Suddenly, this other software (which depended on a particular electronic environment – which it may or may not be aware of) stops working and begins to exhibit a different phenotype despite no change whatsoever in the underlying source code.&lt;/P&gt;
&lt;P&gt;Of course, this is not nearly enough evidence to consider software itself an organism. Rather, most could probably agree that we can define life as something that is able to perpetuate itself. DNA is the basis of all known life precisely because it is so efficiently and accurately able to replicate itself. To some extent, we can see some software that is able to replicate itself – think of a computer virus. The problem with computer viruses is that they are so very efficient at replicating themselves. However, you do not typically think of a program such as Microsoft Word replicating itself wildly. If we dig a bit deeper, however, we can see a better comparison. DNA, in and of itself, really is not that terribly useful. As soon as you introduce enzymes which are able to read that DNA and duplicate it, then you have a powerful self-replication system. (You further need the ability to read the DNA, create RNA, and generate proteins if you want that DNA to exhibit a phenotype. Otherwise, life as we know it would be nothing more advanced than a huge number of strands of DNA floating around in the primordial ooze.) These enzymes are an agent. Another example of an agent providing the means of replication is with an actual (not computer) virus. Many times, they are nothing more than a simple strand of DNA, optimized for entering host cells and utilizing their resources to replicate. They cannot replicate without a host cell. Most software, similarly, does not replicate itself. However, you can use a host (such as a CD burning facility, or a web site) to generate copies of the source code, and thereby spawn additional instances of that phenotype.&lt;/P&gt;
&lt;P&gt;So, at its root, both DNA and source code are a code (one digital with 2 permutations, the other with 4 permutations) that can be read in a certain environment to exhibit a phenotype, and furthermore can be replicated to perpetuate their own lives over time. To me, that means that we have a kind of non-biological organism. Conceiving of software in this way allows us to open our minds to many of the things that we have discovered in the biological realm, which we can potentially leverage to improve our analogous software. Most interesting to me is the concept of evolution.&lt;/P&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=429757" width="1" height="1"&gt;</description><category domain="http://blogs.msdn.com/cjacks/archive/tags/Software+Evolution/default.aspx">Software Evolution</category></item></channel></rss>