In my previous posts I’ve wondered about the future viability of static typing, arrays and loops. Can I get any more out of touch with reality? Well, what about numbers?
Computers are number crunchers, so how can we possibly program a computer with a language that doesn’t know numbers? Surely I jest?
Well yes, but only a little. Numbers are actually very problematic citizens of programming languages. C# is a mess with sbyte, short, int, long, byte, ushort, ulong, float, double, decimal, and even weird pseudo-numbers like char, IntPtr, UIntPtr, and pointers. C is even worse, with things such as short, long and long long, and rules that do not tell you how long a long is, but do tell you that it is at least as long as a short.
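Java, for what it’s worth, sits between these two extremes: it has fewer integral types than C#, and unlike C it pins their sizes down exactly. A quick sketch of what that looks like:

```java
// Java fixes the width of every integral type in the language spec,
// where C only promises relative orderings (long >= int >= short).
public class Sizes {
    public static void main(String[] args) {
        System.out.println("byte:  " + Byte.MIN_VALUE + " .. " + Byte.MAX_VALUE);
        System.out.println("short: " + Short.MIN_VALUE + " .. " + Short.MAX_VALUE);
        System.out.println("int:   " + Integer.MIN_VALUE + " .. " + Integer.MAX_VALUE);
        System.out.println("long:  " + Long.MIN_VALUE + " .. " + Long.MAX_VALUE);
    }
}
```

The prices are fixed, but you still have to shop: the programmer, not the problem, picks the width.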
Of course, numbers by themselves are somewhat meaningless. Rather than just “5” we might mean “5 meters” or “5 kilograms”. Not surprisingly, there have been attempts at making programming languages aware of the units attached to numbers. During the early days of the ECMAScript Edition 4 work, a lot of effort went into such a proposal, which got weirder and weirder as time went on and eventually fizzled out and disappeared.
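You can approximate unit awareness in any typed language by wrapping numbers. Here is a minimal sketch (the `Meters` and `Kilograms` classes are hypothetical, not any real library); it catches unit mix-ups at compile time but says nothing about conversions or derived units, which is where the ES4 proposal got complicated:

```java
// Hypothetical unit wrappers: the type checker refuses to add
// meters to kilograms, because no such method exists.
final class Meters {
    final double value;
    Meters(double value) { this.value = value; }
    Meters plus(Meters other) { return new Meters(value + other.value); }
}

final class Kilograms {
    final double value;
    Kilograms(double value) { this.value = value; }
}

public class Units {
    public static void main(String[] args) {
        Meters distance = new Meters(5).plus(new Meters(3));
        System.out.println(distance.value);       // 8.0
        // new Meters(5).plus(new Kilograms(3));  // does not compile
    }
}
```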
And then there is the small matter of precision. In much of the scientific world, there is a big difference between 5.0 and 5.00. Ada had some interesting ideas here: you could specify the precision needed for a number and then the compiler would select the best implementation that meets the required precision. No other major language picked this up, so the experiment must be considered a failure.
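The 5.0-versus-5.00 distinction is not purely hypothetical in today’s languages, by the way: Java’s BigDecimal (like C#’s decimal) carries a scale alongside the value, so trailing zeros survive. A small demonstration:

```java
import java.math.BigDecimal;

public class Precision {
    public static void main(String[] args) {
        BigDecimal a = new BigDecimal("5.0");    // one digit after the point
        BigDecimal b = new BigDecimal("5.00");   // two digits
        System.out.println(a.scale());           // 1
        System.out.println(b.scale());           // 2
        System.out.println(a.equals(b));         // false: scale is part of identity
        System.out.println(a.compareTo(b) == 0); // true: numerically equal
    }
}
```

Note the awkwardness: the language cannot decide whether 5.0 and 5.00 are the same number, so it gives you two notions of equality and lets you pick.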
Coming back to C# and its myriad of numeric types, the C# rules for how numeric values are converted from one type to another are complex and arcane. It can take some careful reasoning and consulting the rather daunting specification to figure out what happens in some cases. There is also a rather tricky interaction between numeric promotion and method overload resolution. At least the C# rules mostly do the right thing. You’ll never accidentally turn 255 into -1 or 257 into 1, but you do need to insert explicit casts in many cases where they are not actually needed (which is something you become very aware of when you translate a C program to C#). Not that C has it right: it happily lets you treat a numeric variable as a signed number in one place and an unsigned number in another, without so much as a “by your leave”.
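Java follows the same philosophy as C# here: narrowing never happens implicitly, but once you write the cast, the language wraps without complaint. So the 255-into--1 accident is still one keystroke away:

```java
public class Narrowing {
    public static void main(String[] args) {
        int i = 255;
        // byte b = i;         // does not compile: narrowing demands a cast
        byte b = (byte) i;     // the explicit cast silently wraps
        System.out.println(b);          // -1
        System.out.println((byte) 257); // 1
    }
}
```

The cast is where the language washes its hands: it made you say the magic word, and from then on the wraparound is your problem.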
A big problem these days, when unsuspecting software components are at the mercy of “evil doers” intent on exploring every corner case to see what damage can be done, is that most languages provide arithmetic operations that silently overflow. C# lets you specify that particular areas of code should use overflow checking, but it does not encourage you to make use of this checking. And when you do turn it on, you are confronted with significant slowdowns, as well as the need to handle very costly exceptions every time a number overflows. As a result, few programmers bother to check for overflows, and those who do check have to pay a heavy price or resort to arcane coding practices. A while back I heard a claim that at least 50% of security bugs these days are related to unchecked arithmetic overflows.
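The opt-in nature of checking is easy to see in Java, where the unchecked form is the operator and the checked form is a library call you have to remember to make (Math.addExact and friends, the rough analogue of C#’s checked blocks):

```java
public class Overflow {
    public static void main(String[] args) {
        int max = Integer.MAX_VALUE;
        // The default: silent two's-complement wraparound.
        System.out.println(max + 1);   // -2147483648

        // The opt-in: checked arithmetic via an exception.
        try {
            Math.addExact(max, 1);
        } catch (ArithmeticException e) {
            System.out.println("overflow caught");
        }
    }
}
```

Notice which spelling is shorter. The safe path costs extra keystrokes and an exception handler, so the unsafe path wins by default.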
A persistent proposal is to replace binary floating point numbers with decimal floating point numbers, which would probably be an improvement (if only one could change the meaning of existing programs, which we cannot). But this still does not solve the problem of finite precision, of which rounding mode to choose, or of the fact that in much of science, 1.0 is not the same as 1.00.
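Decimal representations make the rounding-mode problem very concrete. With Java’s BigDecimal, dividing 1 by 3 without stating a scale and rounding mode is simply an error, and different modes give different answers:

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class Rounding {
    public static void main(String[] args) {
        BigDecimal one = BigDecimal.ONE;
        BigDecimal three = new BigDecimal("3");

        // 1/3 has no finite decimal expansion, so the programmer
        // must pick a scale and a rounding mode; the choice matters.
        System.out.println(one.divide(three, 2, RoundingMode.FLOOR));   // 0.33
        System.out.println(one.divide(three, 2, RoundingMode.CEILING)); // 0.34

        // one.divide(three); // throws ArithmeticException:
        //                    // non-terminating decimal expansion
    }
}
```

Decimal merely relocates the rounding decision from the hardware to the programmer; it does not make the decision go away.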
All this makes me wonder if we should remove the special status that finite precision numbers currently enjoy in programming languages. Perhaps the default interpretation of a numeric literal should be: a Rational number represented as a pair of arbitrary precision numbers. All other kinds of numbers should be objects with the same status as any other object in the language. Compilers can have special knowledge of some of these types of objects, but this should be an implementation detail, not a language feature.
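To make the proposal concrete, here is a rough sketch of such a default type, swapping in the usual representation of a rational as a pair of arbitrary-precision integers kept in lowest terms (the `Rational` class below is hypothetical, not a standard library type):

```java
import java.math.BigInteger;

// Sketch of an exact rational number: numerator and denominator are
// arbitrary-precision integers, reduced on construction.
final class Rational {
    final BigInteger num, den;

    Rational(long num, long den) {
        this(BigInteger.valueOf(num), BigInteger.valueOf(den));
    }

    Rational(BigInteger num, BigInteger den) {
        BigInteger g = num.gcd(den);
        this.num = num.divide(g);
        this.den = den.divide(g);
    }

    Rational plus(Rational o) {
        return new Rational(num.multiply(o.den).add(o.num.multiply(den)),
                            den.multiply(o.den));
    }

    @Override public String toString() { return num + "/" + den; }
}

public class Exact {
    public static void main(String[] args) {
        // In binary floating point, 0.1 + 0.2 is not 0.3...
        System.out.println(0.1 + 0.2 == 0.3);  // false

        // ...but as rationals, 1/10 + 2/10 is exactly 3/10.
        System.out.println(new Rational(1, 10).plus(new Rational(2, 10)));
    }
}
```

With exact arithmetic as the default, overflow, wraparound and rounding all disappear from the core language; precision becomes a property you ask for explicitly, not one that is silently imposed.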
Of course, compilers should also do an outstanding job of not making you pay the full price of using arbitrary precision Rational numbers as your default numeric type. I suspect, however, that once you remove arrays and loops from a programming language, the performance of arithmetic operations will be much less of an issue.