I tried to get my automation script for payment verification, part of the napkin math area, checked in.  One of the testers on my team rejected it.  Here's why.


First, remember the equation I was using:



Ultimately, this is a hard-coded string that the script enters on a page (up to the = sign); it then reads the result, again as a string, from the page once OneNote computes it.  So my script boils down to checking that the result of the calculation is exactly the string 1813.033713614259.  I am verifying that the equation produces exactly the same result with every build of OneNote.
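The verification step could be sketched roughly like this in Python (the helper name and the way the result is read back are my assumptions, not the script's actual code):

```python
# Hypothetical sketch of the exact-string verification (names are assumptions).
EXPECTED = "1813.033713614259"

def verify_exact(actual: str) -> bool:
    """Pass only if the page shows exactly the expected string."""
    return actual == EXPECTED

# The automation reads the computed text back from the page as a string:
print(verify_exact("1813.033713614259"))  # True: exact match
print(verify_exact("1813.033713614258"))  # False: last digit differs
```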


This is not a great way to verify mathematical results, though.  There are two problems here.


One was pointed out as a design flaw here last week.  Seth indicated that money-based functions like this should have only 2 digits of precision (for US dollars), and that maybe there should be a setting to control the level of precision.  OneNote tries to minimize the complexity of its formatting surface, and when the feature was designed, the team that worked on it decided to have these functions go to "as much precision as possible."  We also don't want to flood the user with too many settings - OneNote is designed to be a lightweight formatting application.  Seth has a point, though.  These are just two different views of the same problem.


But the reason my script failed code review was the exactness of comparing the actual result to the expected one.  If you dig into computer-based mathematical comparisons, you quickly learn that rounding errors are common during the conversion from base 10 to binary and back.  In this case, if we change compilers at some future point, the result of this equation could be something like 1813.033713614258.

Notice the least significant digit here is now 8 instead of 9.
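A quick illustration of why those least significant digits are fragile (this is the standard floating-point example, not taken from OneNote itself): most decimal fractions have no exact binary representation, so the stored value is only the nearest double.

```python
# 0.1 and 0.2 have no exact binary representation; the stored doubles are
# slightly off, and the error shows up in the least significant digits.
total = 0.1 + 0.2
print(total)          # 0.30000000000000004
print(total == 0.3)   # False: accumulated rounding error
# Printing more digits reveals the actual binary double nearest to 0.1:
print(format(0.1, ".20f"))  # 0.10000000000000000555...
```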


The suggested way of verifying the mathematical result of these operations is to adopt an "equalish" type of comparison.  I could modify the test to check only the first 5 digits of precision, or look at a range of values around the expected result (which I would compute beforehand) to see whether the actual result falls within that range.  Then I would also need to trust that the less-than and greater-than comparisons would work as expected after the compiler changed.  This loses mathematical "rigor" and goes far past the break-even point needed to make automation worthwhile.  I would spend far too much time creating the system to ever get the payback from the automation runs.
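For reference, an "equalish" check along these lines can be written with the standard library's `math.isclose`; the tolerance value here is my own illustrative choice, not anything the team specified:

```python
import math

EXPECTED = 1813.033713614259

def verify_equalish(actual: float, rel_tol: float = 1e-12) -> bool:
    """'Equalish' check: pass if actual is within a relative tolerance."""
    return math.isclose(actual, EXPECTED, rel_tol=rel_tol)

print(verify_equalish(1813.033713614258))  # True: off only in the last digit
print(verify_equalish(1813.03111111111))   # False: well outside tolerance
```

Note that even this simple version forces me to pick and justify a tolerance, which is exactly the extra design work described above.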


This is why I prefer my method of string comparison.  If OneNote 2007 computes the answer to end with "9," and the answer is pre-verified to be within the range of valid floating point answers (it is), then future versions of OneNote should produce exactly the same value.


Now I will get an alert if we change compilers and the results change.  This is a big win in itself.  Suppose we change to a "dogfood" compiler and the answer turns out to be 1813.03111111111.  Clearly, this is very different from what was expected, yet depending on what level of "equalish" comparison I adopted, it might not be detected by automation.  Other behavior might change as well, and without this clue left by the failing automation, tracking down the problem would take much longer.
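To make that concrete (using the hypothetical dogfood result above): a "first 5 digits" style of equalish check would actually pass here, because both values start with 1813.0, while the exact string comparison flags the change immediately.

```python
# Hypothetical compiler-change scenario: a loose "first 5 digits" check
# misses the regression; exact string comparison catches it.
EXPECTED = "1813.033713614259"
dogfood_result = "1813.03111111111"   # hypothetical result after a compiler change

print(dogfood_result[:6] == EXPECTED[:6])  # True: first 5 digits still agree
print(dogfood_result == EXPECTED)          # False: exact comparison flags the change
```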


So the tester who rejected my script came by and we talked (and talked, and debated, and finally communicated) about the necessity of checking for expected results.  In this case, since the expected result is a valid result, we decided to check in the script using the exact string comparison verification.  On to more scripts!


If you are really interested in getting more details about "equalish" mathematical comparisons, check these pages:




Questions, comments, concerns and criticisms always welcome,