In my previous posts I’ve wondered about the future viability of static typing, arrays and loops. Can I get any more out of touch with reality? Well, what about numbers?

Computers are number crunchers, so how can we possibly program a computer with a language that doesn’t know numbers? Surely I jest?

Well yes, but only a little. Numbers are actually very problematic citizens of programming languages. C# is a mess with sbyte, short, int, long, byte, ushort, ulong, float, double, decimal, and even weird pseudo numbers like char, IntPtr, UIntPtr, and pointers. C is even worse, with things such as short, long and long long, and rules that do not tell you how long a long is, but do tell you that it is at least as long as a short.
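To get a feel for the zoo, here is a quick C# sketch, using nothing beyond the built-in sizeof operator and the standard Min/Max constants, that prints the size in bytes and the range of a few of these types:

```csharp
using System;

// A quick census of some of C#'s built-in numeric types: size in bytes and range.
Console.WriteLine($"sbyte:   {sizeof(sbyte)}  [{sbyte.MinValue}, {sbyte.MaxValue}]");
Console.WriteLine($"byte:    {sizeof(byte)}  [{byte.MinValue}, {byte.MaxValue}]");
Console.WriteLine($"short:   {sizeof(short)}  [{short.MinValue}, {short.MaxValue}]");
Console.WriteLine($"int:     {sizeof(int)}  [{int.MinValue}, {int.MaxValue}]");
Console.WriteLine($"long:    {sizeof(long)}  [{long.MinValue}, {long.MaxValue}]");
Console.WriteLine($"decimal: {sizeof(decimal)}  [{decimal.MinValue}, {decimal.MaxValue}]");
// And char is a 16-bit number too, whether we like it or not.
Console.WriteLine($"char:    {sizeof(char)}  'A' == {(int)'A'}");
```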

And then there are the literals for numbers. C has decimal integer literals, floating point literals, octal literals (which look uncannily like decimal literals, except that 012 != 12), hexadecimal literals and all sorts of weird suffixes. C#, thankfully, gets rid of octal literals, but adds suffixes galore. JavaScript had octal literals, then deprecated them, but didn’t quite follow through with that.
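Here is a taste of the suffix situation in C#: the same value spelled half a dozen different ways, plus a hexadecimal literal whose prefix at least cannot be mistaken for decimal:

```csharp
using System;

// One value, many spellings: C# literal suffixes at work.
long    a = 1L;    // long
ulong   b = 1UL;   // unsigned long
uint    c = 1U;    // unsigned int
float   d = 1F;    // float
double  e = 1D;    // double
decimal f = 1M;    // decimal
int     g = 0x0C;  // hexadecimal 12; unlike C's octal 014, the prefix is unmistakable

Console.WriteLine((a, b, c, d, e, f, g));
```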

Of course, numbers by themselves are somewhat meaningless. Rather than just “5” we might mean “5 meters” or “5 kilograms”. Not surprisingly, there have been attempts at making programming languages aware of the units of numbers. During the early days of the ECMAScript Edition 4 work, a lot of effort went into such a proposal, which got weirder and weirder as time went on and eventually fizzled out and disappeared.
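The idea itself is easy to sketch by hand, though. The little Metres struct below is purely hypothetical, not part of any real proposal, and only hints at what a unit-aware number might look like:

```csharp
using System;

var track = new Metres(5) + new Metres(0.5);  // metres plus metres is fine
Console.WriteLine(track);                     // prints "5.5 m"
// var wrong = new Metres(5) + 5;             // does not compile: no metres + plain number

// A hypothetical unit-carrying number: a double that remembers it means metres.
readonly struct Metres
{
    public double Value { get; }
    public Metres(double value) => Value = value;

    public static Metres operator +(Metres a, Metres b) => new Metres(a.Value + b.Value);

    public override string ToString() => $"{Value} m";
}
```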

And then there is the small matter of precision. In much of the scientific world, there is a big difference between 5.0 and 5.00. Ada had some interesting ideas here: you could specify the precision needed for a number and then the compiler would select the best implementation that meets the required precision. No other major language picked this up, so the experiment must be considered a failure.
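Interestingly, C#'s decimal type comes closest to honoring this distinction: it stores a scale alongside the digits, so 5.0m and 5.00m compare equal but print differently:

```csharp
using System;

decimal a = 5.0m;
decimal b = 5.00m;

Console.WriteLine(a == b);  // True: equal as numbers
Console.WriteLine(a);       // 5.0  -- but the scale is remembered
Console.WriteLine(b);       // 5.00
```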

Coming back to C# and its myriad of numeric types, the C# rules for how numeric values are converted from one type to another are complex and arcane. It can take some careful reasoning, and a consultation of the rather daunting specification, to figure out what happens in some cases. There is also a rather tricky interaction between numeric promotion and method overload resolution. At least the C# rules mostly do the right thing. You’ll never accidentally turn 255 into -1 or 257 into 1, but you do need to insert explicit casts in many cases where they are not actually needed (something you become very aware of when you translate a C program to C#). Not that C has it right: it happily lets you treat a numeric variable as a signed number in one place and an unsigned number in another, without so much as a “by your leave”.
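To make the cast-insertion chore concrete, here is a small C# example: arithmetic on small integer types is promoted to int, so the result has to be cast back even when it obviously fits, and a literal suffix alone decides which overload runs:

```csharp
using System;

class Promotion
{
    static void Main()
    {
        byte x = 200, y = 55;

        // byte z = x + y;       // does not compile: x + y is computed as an int
        byte z = (byte)(x + y);  // explicit cast required, even though 255 fits in a byte
        Console.WriteLine(z);    // 255

        Print(1);                // overload resolution picks Print(int)
        Print(1L);               // the suffix alone changes which method runs
    }

    static void Print(int n)  => Console.WriteLine($"int: {n}");
    static void Print(long n) => Console.WriteLine($"long: {n}");
}
```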

A big problem these days, when unsuspecting software components are at the mercy of “evildoers” intent on exploring every corner case to see what damage can be done, is that most languages provide arithmetic operations that silently overflow. C# lets you specify that particular areas of code should use overflow checking, but it does not encourage you to make use of this checking. And when you do turn it on, you are confronted with significant slowdowns, as well as the need to handle very costly exceptions every time a number overflows. As a result, few programmers bother to check for overflows, and those who do check have to pay a heavy price or resort to arcane coding practices. A while back I heard a claim that at least 50% of security bugs these days are related to unchecked arithmetic overflows.
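For the record, this is what C#'s opt-in checking looks like in miniature: the default silently wraps around, and checked trades that for an exception:

```csharp
using System;

int big = int.MaxValue;

Console.WriteLine(big + 1);  // -2147483648: silent wrap-around by default

try
{
    Console.WriteLine(checked(big + 1));  // opt-in overflow checking...
}
catch (OverflowException)
{
    Console.WriteLine("overflow!");       // ...and a costly exception when it fires
}
```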

In JavaScript, things are not quite so bad. There is only one numeric type: the IEEE 64-bit floating point number (the double). Such numbers have the nice property that they do not overflow and that all arithmetic operations are complete (defined for every value). Yet JavaScript programmers are not happy campers, and the language design committee keeps looking for ways to improve the language. The reason, in a nutshell, is that numeric literals are (mostly) expressed in decimal, while doubles are represented as fixed length binary numbers. This can lead to expressions resulting in numbers that print slightly differently from what high school arithmetic leads us to expect. It also leads to persistent bug reports from incredulous programmers who do not understand the intricacies of fixed length binary floating point numbers.
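The very same doubles live in C#, so the classic surprise is easy to reproduce there as well (the exact text printed depends on the runtime’s formatting; the inequality does not):

```csharp
using System;

double sum = 0.1 + 0.2;

Console.WriteLine(sum == 0.3);  // False
Console.WriteLine(sum);         // 0.30000000000000004 on current .NET runtimes
Console.WriteLine(0.3);         // 0.3
```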

A persistent proposal is to replace binary floating point numbers with decimal floating point numbers, which is probably an improvement (if one could just change the meaning of existing programs, which we cannot). But this still does not solve the problems of finite precision, of which rounding mode to choose, and of how to deal with the fact that in much of science, 1.0 is not the same as 1.00.
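C#'s decimal gives a hint of what such a switch would buy, and what it would not: the decimal literals now mean what they say, but finite precision and rounding are still very much with us:

```csharp
using System;

Console.WriteLine(0.1m + 0.2m == 0.3m);  // True: the decimal literals mean what they say

decimal third = 1m / 3m;
Console.WriteLine(third);      // 0.3333333333333333333333333333
Console.WriteLine(third * 3);  // 0.9999999999999999999999999999, not 1
```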

All this makes me wonder if we should remove the special status that finite precision numbers currently enjoy in programming languages. Perhaps the default interpretation of a numeric literal should be a Rational number represented as a pair of arbitrary precision decimal numbers. All other kinds of numbers should be objects with the same status as any other object in the language. Compilers can have special knowledge of some of these types of objects, but that should be an implementation detail, not a language feature.
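Here is a minimal sketch of what such a default might look like, simplified to a numerator/denominator pair of arbitrary precision integers using the BigInteger type that ships with .NET; the Rational type itself is hypothetical and not part of any library:

```csharp
using System;
using System.Numerics;

var third = new Rational(1, 3);
Console.WriteLine(third + third + third);  // 1/1: exact, no rounding anywhere

// A hypothetical arbitrary-precision rational: a pair of BigIntegers, kept in lowest terms.
readonly struct Rational
{
    public BigInteger Num { get; }
    public BigInteger Den { get; }

    public Rational(BigInteger num, BigInteger den)
    {
        var g = BigInteger.GreatestCommonDivisor(num, den);
        Num = num / g;
        Den = den / g;
    }

    public static Rational operator +(Rational a, Rational b) =>
        new Rational(a.Num * b.Den + b.Num * a.Den, a.Den * b.Den);

    public override string ToString() => $"{Num}/{Den}";
}
```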

Of course, compilers should also do an outstanding job of not making you pay the full price of using arbitrary precision Rational numbers as your default numeric type. I suspect, however, that once you remove arrays and loops from a programming language, the performance of arithmetic operations will be much less of an issue.