I was discussing the notion, the other day, that a defect in design may be expensive, but a defect in the fundamental assumptions of a project can be catastrophic. In other words, if you are doing the thing wrong, you can fix it, but if you are doing the wrong thing, you get to start over.
So when, in a meeting on Enterprise Architecture, the speaker asked the audience if anyone had ever delivered a project only to find, a short time later, that it had failed, I was not surprised when a good percentage of folks raised their hands. We've all been on projects where we thought we were doing the right thing, and doing it well, only to find out after delivery that we had screwed up.
One thing that did surprise me, though, was one gentleman who mentioned that he had been on a project that delivered, and he didn't find out for six months that the project had failed... not because he was "out of the loop" or took a very long vacation, but because the customer didn't know that the project was a failure for that long.
That is scary.
It occurs to me that I haven't seen anything like that on agile projects. The hallmark of an agile project is that you stop, OFTEN, and show the results to the customer. Not the marketing person. Not the project manager... the customer. You get feedback. And you make changes. Change is embraced, not avoided.
So, if you are doing the wrong thing, it should be obvious early. In fact, it could become obvious so early that the team hasn't spent the lion's share of the original funds yet... still time enough to fix something and get back on track. This is Great! If you are going to waste money, find out early and stop. Then, reorient the investment. It is far worse to develop the entirely wrong application than it is to develop what you can of a good one.
That doesn't happen with waterfall projects. On the other hand, the waterfall project has the dubious "advantage" of delivering the wrong thing anyway. Teams get rewarded on the quality of the delivery, not on the alignment between the delivery and the actual needs. Developers get gifts and good marks for "getting it done right" but not for "getting the right thing done". The mismatch won't be discovered until later, and then the dev team will deflect the blame to the analysts who collected the requirements.
And this works against the Agile methods. Even though Agile methods spend money better, they don't get to that end-date when everyone throws up their hands for joy and says "We Got It Done." They don't get the prize at the end that people crave: the promotion out of waterfall h_ll. The right to go home on time. The plasticine trophy for the windowsill.
So if you want to know why agile methods aren't fostered more often, or more closely, look no further than the "ship party" that roundly celebrates the delivery of a dead horse.
To this end, I propose a new practice for agilists: the kill party... where everyone celebrates when a bad idea is killed before it consumes buckets of shareholders' cash.
I suggest that we add temporal foreign keys to relational database design.
Programs move data. Databases store data in a consistent fashion. These different purposes can lead to different organizing principles. One of the key reasons for Object Oriented structures is to minimize cost and complexity when things change over time. This creates a temporal relationship between various designs, a relationship that is powerfully supported by object oriented development.
However, while programs hold functions that change over time, we don't have a good structure for isolating the complexity caused by data that changes over time. And so I ask the question: how do we begin to create the notion, in the data storage layer, that data can hold a temporal relationship with other data?
I don't mean the notion that a data record would have a "last updated date". I mean the notion that a data table may contain a foreign key to another table, where a record keeps the values in the related table that existed on a particular date, even if the data in the related table is later changed.
For example: let's say that company A sells products. Their products are P1 and P2, and they sell for $20 each. Now, company B places an order for product P1. In current RDBMS systems, we actively copy the price at the time of the order from the 'products' table to the 'purchase order details' table, because the price could change later and we want to remember the price on the date the order was placed.
However, this is a workaround. The fact is that the purchase order has a temporal relationship with the products table... a relationship that current RDBMS systems cannot express... so we copy fields around. The decision of what fields to copy belongs to the 'purchase order details' table, and the code is complicated by the fact that specific fields have to be selected from that table instead of the products table when evaluating a product. It's a kind of "overlay". The relationship says: pick fields from the related table unless a field by that name happens to exist in the current table.
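To make the workaround concrete, here is a throwaway sketch using SQLite through Python's sqlite3 module. The table and column names are invented for illustration; they are not from any real schema.

```python
import sqlite3

# Invented schema illustrating the copy-the-price workaround described above.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE products (id TEXT PRIMARY KEY, price REAL);
    CREATE TABLE order_details (
        order_id   INTEGER,
        product_id TEXT REFERENCES products(id),
        unit_price REAL   -- copied from products at order time
    );
    INSERT INTO products VALUES ('P1', 20.0), ('P2', 20.0);
""")

# Company B orders P1: copy the current catalog price into the order row.
price = con.execute("SELECT price FROM products WHERE id = 'P1'").fetchone()[0]
con.execute("INSERT INTO order_details VALUES (1, 'P1', ?)", (price,))

# Later the catalog price changes...
con.execute("UPDATE products SET price = 25.0 WHERE id = 'P1'")

# ...but the order still remembers the price in effect when it was placed.
remembered = con.execute(
    "SELECT unit_price FROM order_details WHERE order_id = 1").fetchone()[0]
print(remembered)  # 20.0
```

The duplication is the point: unit_price exists only because the relational model gives us no way to say "the price as of the order date."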
Should this relationship be defined by the 'order details' table? Shouldn't the owner of the data (the products table) decide what columns to expose as "time related" while other columns are not? I submit that this would put responsibilities where they belong and reduce the complexity of the data systems themselves.
I suggest this innovative feature for RDBMS systems: a temporal foreign key. The owner of a table indicates what fields are likely to change frequently and would have their data kept in a temporal structure, while other fields are not temporal (like a relationship with the bill of materials used for new shipments). Then, when a foreign key is made, by placing the product id into the 'order details' table, the date of the relationship is noted. Temporal field values are fixed at the value in place at that time. No reason to copy fields to another table.
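A sketch of how the proposed temporal foreign key might behave, simulated in Python. The class and its API are invented for illustration; a real implementation would live inside the database engine, not in application code.

```python
from datetime import date

class TemporalTable:
    """Toy model of the proposed feature: the table's owner declares which
    columns are temporal. Changes to those columns are versioned by date, so
    a dated foreign key resolves the values in effect when it was created."""

    def __init__(self, temporal_columns):
        self.temporal_columns = set(temporal_columns)
        self.rows = {}      # key -> non-temporal column values
        self.history = {}   # (key, column) -> list of (date, value), date-ordered

    def upsert(self, key, as_of, **values):
        row = self.rows.setdefault(key, {})
        for col, val in values.items():
            if col in self.temporal_columns:
                versions = self.history.setdefault((key, col), [])
                versions.append((as_of, val))
                versions.sort(key=lambda v: v[0])
            else:
                row[col] = val

    def get(self, key, as_of):
        """Resolve a dated foreign key: temporal columns as of `as_of`."""
        row = dict(self.rows[key])
        for col in self.temporal_columns:
            for d, val in self.history.get((key, col), []):
                if d <= as_of:
                    row[col] = val  # later entries overwrite; list is date-ordered
        return row

products = TemporalTable(temporal_columns={"price"})
products.upsert("P1", as_of=date(2005, 1, 1), name="Widget", price=20.0)

order_date = date(2005, 3, 15)  # the date stamped on the FK in 'order details'
products.upsert("P1", as_of=date(2005, 6, 1), price=25.0)  # later price change

price_at_order = products.get("P1", as_of=order_date)["price"]
price_now = products.get("P1", as_of=date(2005, 12, 1))["price"]
print(price_at_order, price_now)  # 20.0 25.0
```

The order row stores only the product id and a date; no copied columns, and the products table alone decides which columns are temporal.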
The code in the calling systems would be much simpler, as would the designs of the databases themselves. The complexity of the database system would increase, but no more than it already does for other forms of referential integrity.
It is time to consider this kind of relationship as an innovation to the now 30-year-old basic notions of relational databases.
The marketplace of ideas is an amazing place. When Microsoft came up with the notion of Remote Scripting (many years ago), the Netscape folks scoffed. At the time, folks looked at MS and said, "This is a war, and I won't use a feature from the big bad wolf!" The notion of asynchronously updating part of a web page, while powerful, lay dormant for years.
Sure, IE kept the feature alive, but few folks used it. Then, as soon as the Mozilla/Firefox folks decided to go ahead and embrace the notion, it became safe for the public to use. Only then was it "cross platform." Alas, the key was not to add the feature to our browser, but to add it to every browser. (Interesting.)
The success of Gmail, and a marketing campaign by a consulting company, have led to some visibility. There's a new marketing term for this long-existing technique: Ajax. Nice name. Marketing, they get.
The great thing for MS platform developers: Just as the term will be gaining steam, Microsoft will release ASP.Net 2.0, which looks to have built-in support for it. The product groups have come up with a competing name: Atlas.
So, special thanks to Jesse James Garrett for publicizing a feature of our new platform. If you want to know more about implementing Ajax, both in ASP.Net 2.0 and in .Net 1.1, see this paper by Dino Esposito on the MSDN site.
If you want to know more about Atlas, see this blog entry from scottgu.
It is nice to be ahead of the curve.
A friend of mine pointed out an interesting post by Scott Hanselman that used a clever phrase: "having a High Bus Factor" which is to say: if the original developer of a bit of code is ever hit by a bus, you are toast.
The example that Scott gave was a particular regular expression that I just have to share. To understand the context, read his blog.
private static Regex regex = new Regex(@"\<[\w-_.: ]*\>\<\!\[CDATA\[\]\]\>\</[\w-_.: ]*\>|\<[\w-_.: ]*\>\</[\w-_.: ]*\>|<[\w-_.: ]*/\>|\<[\w-_.: ]*[/]+\>|\<[\w-_.: ]*[\s]xmlns[:\w]*=""[\w-/_.: ]*""\>\</[\w-_.: ]*\>|<[\w-_.: ]*[\s]xmlns[:\w]*=""[\w-/_.: ]*""[\s]*/\>|\<[\w-_.: ]*[\s]xmlns[:\w]*=""[\w-/_.: ]*""\>\<\!\[CDATA\[\]\]\>\</[\w-_.: ]*\>",RegexOptions.Compiled);
I must admit to having developed code, in the (now distant) past that had a similar high bus factor. Nothing as terse as the above example, thank goodness, but something kinda close. On two occasions, actually. I look back and hope that I have learned, but I'm not certain that I have.
The trick here is that I do not know the developer who follows me. He or she will know some basic and common things. The problem lies deeper... It is where my expertise exceeds the ability of a maintenance developer to understand my code... that is where the break occurs.
So how do we avoid this? How does a good developer keep from creating code with a High Bus Factor?
It isn't documentation. I have been using regular expressions for decades (literally) and the above code is wildly complicated, even for me. No amount of documentation would make that chunk of code simple for me to read or maintain.
Pithy advice, like "use your tools wisely" won't help either. One could argue that regular expressions were not being appropriately used in this case, and in fact, the blog entry describes replacing it because it wasn't performing well when larger files were being scanned. That isn't the point.
I would state that any sufficiently powerful technique (whether regex, or the use of an advanced design pattern, or the use of SQL XML in some clever way, etc.) presents the risk of exceeding the ability of another developer to understand, and therefore maintain, it.
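To illustrate the readability gap (not to excuse the original), here is a much-simplified, hypothetical cousin of the pattern above: a matcher for two spellings of "empty" XML elements, written in Python with re.VERBOSE so each alternative can be laid out and commented. This lowers the bus factor; it does not eliminate it.

```python
import re

# Element names roughly as in the pattern above: word chars plus '.', ':', '-'
NAME = r"[\w.:-]+"

# re.VERBOSE lets a dense alternation be written one case per line.
EMPTY_ELEMENT = re.compile(rf"""
      <({NAME})\s*/>        # self-closing form:   <tag/>
    | <({NAME})></\2>       # open/close pair:     <tag></tag>
""", re.VERBOSE)

print(bool(EMPTY_ELEMENT.search("<item/>")))            # True
print(bool(EMPTY_ELEMENT.search("<item></item>")))      # True
print(bool(EMPTY_ELEMENT.search("<item>text</item>")))  # False
```

Even decomposed, the pattern demands regex fluency from whoever maintains it, which is exactly the point.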
Where does the responsibility lie for ensuring that a dev team, brought in to maintain a bit of code, is able to understand it? Is it the responsibility of the development manager? The dev lead? The original developers? The architects or code quality gurus? The unit tests?
Is it incumbent upon the original dev team to make sure that their code does not have a High Bus Factor? If so, how?
I'm not certain. But it is an interesting issue.
We have an easy notion of the data dictionary: a description of the data at rest in an OLTP system. But what about the data in motion? That's where the Business Event Schema comes in.
More than a simple XML schema, a business event schema is a description that contains the following elements:
This is, IMHO, a key deliverable for any architect attempting to describe a business process and how the systems involved in that process can be integrated with one another in a real-time fashion. This event-driven integration goes hand-in-hand with service oriented architecture, in that the systems involved are loosely coupled, with explicit boundaries, using well-defined data schemas, and at a coarse-grained level of interaction.
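As a hedged sketch only, a business event description might carry fields like the ones below. Every field name here is invented for illustration; it is one guess at the shape of such a description, not the post's actual element list.

```python
from dataclasses import dataclass, field

@dataclass
class BusinessEventSchema:
    """Illustrative only: one guess at what a business event description
    might carry, beyond the raw XML payload schema itself."""
    event_name: str                 # e.g. "OrderPlaced"
    source_system: str              # the system that raises the event
    payload_schema: str             # reference to the schema of the data in motion
    business_meaning: str           # what happened, in business terms
    consumers: list = field(default_factory=list)  # systems that react to it

evt = BusinessEventSchema(
    event_name="OrderPlaced",
    source_system="OrderEntry",
    payload_schema="urn:example:order-placed:v1",
    business_meaning="A customer committed to buy one or more products.",
    consumers=["Fulfillment", "Billing"],
)
print(evt.event_name)
```

The value is in naming the business meaning and the consumers alongside the wire format, so the schema describes an event, not just a message.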
A team member and I found an interesting problem yesterday that I thought I'd share. We found the problem by luck, and the fix was weird. Perhaps there is an easier fix out there.
The problem manifested itself this way:
We needed to build our five different components into different MSI files (don't ask). Each of the five components refers to one or two "base class" assemblies that are included in each MSI. Previously, we had a single solution for each component that creates the assembly and then builds the MSI. Most of the assemblies end up in the GAC.
We were running into problems where we would end up accidentally installing two copies of a base class component into the GAC.
Our solution was to create a single solution file that builds all of the assemblies and builds all of the MSI files. This way, we could use project references and we'd only get one version of a dependent assembly in any MSI file.
The MSI for installing Assembly A is very similar to the MSI for installing Assembly B, because A and B are very similar. They both inherit from the same base objects. The problem was this: After creating the new solution file, and carefully checking every MSI, it appeared that we had it right: MSI-A would install Assembly A, while MSI-B would install Assembly B.
We saved the project and checked it into version control. Then ran our build script. MSI-A would have Assembly A, and MSI-B would have Assembly A as well. Assembly B was not included in any MSI at all!
Opening the project back up showed that, sure enough, MSI-B was defined to use the project output from project A, even though we specifically told it to use B. Fixing the reference using Visual Studio didn't help. The moment we saved and reopened the solution, the MSI would once again show that it refers to the wrong Assembly.
When project B was created, the programmer made a copy of all of the files of project A and put them into another directory. He changed the names a little and ran with it. It never occurred to him to open up the project file and change the Project GUID for the new project.
The project GUID is a unique id for each project. It is stored in the project file, but the solution files and the install projects use it as well. Since we had two projects in the same solution that used the same GUID, VS would just pick the first project with that GUID when building the MSIs. As a result, we had two MSIs with Assembly A and none with Assembly B.
The fix we settled on was to open one of the two project files in Notepad and change the Project GUID. Then we went through every solution file that referenced that project file and changed the referencing GUID value. We had to be careful with the solution file that contained both projects, so that we left one project alone and updated the other.
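A small script can catch this mistake before the build does. The sketch below (an invented helper, not a Microsoft tool) scans project files under a directory for ProjectGuid values that appear in more than one project:

```python
import re
import tempfile
from pathlib import Path

GUID_RE = re.compile(r"<ProjectGuid>\s*(\{[0-9A-Fa-f-]+\})\s*</ProjectGuid>")

def find_duplicate_project_guids(root):
    """Map each ProjectGuid found under `root` to the project files using it,
    keeping only GUIDs shared by more than one project."""
    guids = {}
    for proj in Path(root).rglob("*.*proj"):   # .csproj, .vbproj, ...
        m = GUID_RE.search(proj.read_text(errors="ignore"))
        if m:
            guids.setdefault(m.group(1).upper(), []).append(proj.name)
    return {g: files for g, files in guids.items() if len(files) > 1}

# Demo: project B was copied from project A without getting a new GUID.
with tempfile.TemporaryDirectory() as d:
    guid = "{11111111-2222-3333-4444-555555555555}"
    for name in ("ProjectA", "ProjectB"):
        folder = Path(d, name)
        folder.mkdir()
        (folder / f"{name}.csproj").write_text(
            f"<Project><PropertyGroup><ProjectGuid>{guid}"
            "</ProjectGuid></PropertyGroup></Project>")
    dupes = find_duplicate_project_guids(d)
    print(len(dupes))  # 1 -- one GUID shared by two projects
```

Run as part of the build script, a check like this would have flagged the copied project long before the MSIs came out wrong.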
This worked. The effect was odd. I thought I'd post the problem and our solution in case anyone else makes the mistake of creating an entire project by copying everything from another project, and then putting them both in the same solution file.