In my last two blog posts about software testing (Customers, Talents) I tackled issues that I felt are better understood. In this entry I want to throw out some ideas my team and I started thinking about today on estimating a true testing cost for new features or potential design changes. IMO developer costs are well understood and relatively controllable over time, but if careful attention is not paid you can easily under- or overestimate the amount of time and resources needed to test a product.


For argument’s sake, let’s assume I’m talking about a new feature being added to the product you are testing. I’m also going to assume that you have moved out of the dark ages of relying solely on testing by hand and have some way to automate some percentage of your testing. Finally, I’m going to assume that you don’t start the QA costing process until you have a good understanding of the feature, including potential developer implementation designs. No sense starting off blind. :) How can you estimate the costs involved over time? Here is what we have come up with so far:


I would propose that there are no truly non-subjective measurements that give you the whole picture when it comes to the cost of releasing high quality software.  If you have one let me know.   The following subjective (1 through 10) measures can be used up front to rate the test costs of potential features against one another.  Once you start writing a test plan, once you have a more detailed spec, and once coding begins you should try and refine these numbers based on more concrete evidence over time.   


Has Dependency Scale (1-10): Does this feature stand on its own or is it highly dependent on the features that surround it for success? Think of a 10 as the feature that requires something from every other feature in the product in order to work at all. Think of a 1 as the feature that depends on next to nothing. With these scales I’ll leave figuring out the middle ground as an exercise for the reader.


Is Dependency (1-10): If this code malfunctions, will it bring the entire product screeching to a halt, or does faulty code here only bring about its own demise? Are teams outside of your own dependent on this feature? How much handholding are they going to require? How much testing coordination are you going to have to put in with these teams? Over time you can refine this with actual information about # of references to this code or # of calls made into the code while the product is running, etc.


Importance (1-10): I know, I know, you would never code a feature if it wasn’t important. But just like everything that has a price, I’m sure you could stack rank features in order of relative importance to the end users. For example: You may really like IntelliSense features, but if you had a choice between them and being able to run/compile/debug your code in VS, I’m sure you would pick being able to run/compile/debug your code. So get over yourself and think about how relatively important feature Y really is. If you want to ship on a schedule, knowing this will help you determine where to apply that “extra” testing effort in the long run.


Testability Scale (1-10): How hard is it to reproduce bugs with this feature? Can you automate all the tests or will every test have to be done by hand? If the feature first requires you to create a large amount of dummy data and then simulate a crash (assume you are testing Office document recovery), then it is going to cost you more to test in an ad-hoc fashion and to keep finding bugs in over time.


Complexity (1-10): How complex is this feature? Is the implementation going to be complex? Once you have more information about LOC, # of code paths, or whatever your favorite complexity metrics are, you can nail this one down more.


Security Risk (1-10): Using whatever criteria you like, determine whether the feature represents a large security risk for your product. Security testing is both important to get right and costly to apply to a feature. Giving a high number here means you will want to spend a lot of time thinking like a hacker and using exploratory methods to try to break through, so that, in the real world, you’re satisfied with the reduced risk.


X-Factor (1-10): Ok, maybe this is a cop-out, but there needs to be a way to take an educated guess at even more uncostables. This could include your estimate of how well understood the problem space is. For example: A completely new, innovative, or uncharted feature will present much higher costs than one based on existing work. Another reason for a high number here could be that you think the feature will present problems on international systems. I’m asking for a best guess here.


What does this all add up to?

Add up the numbers, divide by the number of scales, and you get a relative test cost on a scale of 1-10. :) We’re just starting to think about this, so I can’t tell you how many features you can pile onto one person or how much time it takes to test a feature with a cost of 7. It’s something that I’m hoping to figure out over time, combined with the historical data we have. I guess I’m really searching for a numerical way to represent the things that product planners should be thinking of when they evaluate choices between features from a test perspective.
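If it helps to see the arithmetic, here is a rough sketch in Python. It assumes "divide" means a plain average across the seven scales with every scale weighted equally, which is my reading rather than anything we have settled on. The names FeatureScores and relative_test_cost, and the example numbers, are made up for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class FeatureScores:
    """One subjective 1-10 rating per scale described above (hypothetical names)."""
    has_dependency: int
    is_dependency: int
    importance: int
    testability: int
    complexity: int
    security_risk: int
    x_factor: int


def relative_test_cost(scores: FeatureScores) -> float:
    """Average the ratings into one relative cost on the same 1-10 scale.

    Assumption: all scales carry equal weight.
    """
    values = [getattr(scores, f.name) for f in fields(scores)]
    for value in values:
        if not 1 <= value <= 10:
            raise ValueError(f"ratings must be between 1 and 10, got {value}")
    return sum(values) / len(values)


# Made-up ratings for an imaginary feature: the seven numbers sum to 42.
example = FeatureScores(
    has_dependency=3,
    is_dependency=8,
    importance=9,
    testability=6,
    complexity=5,
    security_risk=7,
    x_factor=4,
)
print(relative_test_cost(example))  # 42 / 7 = 6.0
```

The result is only useful for ranking features against one another. It says nothing yet about days or headcount.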


I’m also searching for a useful acronym with which to represent these.  I guess I’d like to find out if I’m missing something before deciding on STICHIX or CHIXIST.  (I’ll also take suggestions on this.)


What can I cost with real numbers?

“Your pseudo science frightens me, what can I cost with real numbers?” Ok, so the above scales aren’t based on exact complex differential equations, but then again neither was Moore’s Law. If you are looking for real costs then you’ll just have to itemize and cost in # of real days, compressed hours, monkeys required to cover everything, or whatever best suits the procedures followed by your team. I’d recommend holding off on getting these numbers until you have written a basic test plan for the feature, so you can get relatively low level and say, for example, “Feature A will require testing functionality Z with about X number of tests that take Q time to develop per test.” You could also try out old standbys like “If it takes someone 1 day to write it then you’ll spend 2 days testing it.” However you want to do it, these tangible, static costs are also important to present when looking at a feature and planning your resources.
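Both of those tangible estimates are simple multiplication, sketched below in Python. The function names and the example figures are hypothetical, and the 2:1 ratio is just the old standby quoted above, not a measured number.

```python
def itemized_test_days(num_tests: int, days_per_test: float) -> float:
    """'About X number of tests that take Q time to develop per test.'"""
    if num_tests < 0 or days_per_test < 0:
        raise ValueError("inputs must be non-negative")
    return num_tests * days_per_test


def rule_of_thumb_test_days(dev_days: float, test_to_dev_ratio: float = 2.0) -> float:
    """'1 day to write it, 2 days testing it.' The ratio is an assumption to tune."""
    if dev_days < 0 or test_to_dev_ratio < 0:
        raise ValueError("inputs must be non-negative")
    return dev_days * test_to_dev_ratio


# Made-up figures: 40 tests at half a day each, for a feature with 8 developer days.
print(itemized_test_days(40, 0.5))     # 20.0
print(rule_of_thumb_test_days(8))      # 16.0
```

When the two estimates disagree by a lot, that is probably a hint that either the test plan or the developer estimate needs another look.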


I’ll make no determination just yet on where the pseudo science and the tangible costs cross, other than knowing they are both important. I’m just thinking out loud here and looking for some feedback on these thoughts.

- josh