Should arrays get axed from future languages?

In some ways the question is shocking. Arrays have been built into programming languages since ancient times. Arrays provide the foundation for many data structures and form the subject of an entire chapter in most introductions to programming. Why meddle with success?

Well, arrays have always been problematic data structures. In FORTRAN they were the source of semantic ambiguity. In Pascal they were the source of type checking problems. In their C/C++ incarnation they have wreaked all sorts of havoc. In their Java/C# incarnation they are so unsatisfactory that no self-respecting API can dare use them.

To expand a bit on their C# incarnation: arrays are always mutable, which is a big problem when you need immutable data structures. Yet they always have a fixed length, which is a big problem when you need data structures that grow and shrink. They are covariant, but only for arrays of reference types. Covariance is great for read-only arrays, but those don’t exist. Covariance is of little use for mutable arrays, but to use those you must pay the price of a runtime type check on element assignment.
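To make the covariance complaint concrete, here is a minimal sketch. The class and method names are made up for illustration; the point is only that the offending assignment compiles and then fails at run time.

```csharp
using System;

public static class CovarianceDemo
{
    // Returns true if storing an object of the wrong type into a
    // covariantly-typed array is rejected at run time.
    public static bool StoreIsRejected()
    {
        string[] names = { "a", "b" };
        object[] objects = names;   // allowed: arrays of reference types are covariant
        // Note: "object[] o = new int[2];" would not compile, because
        // covariance does not extend to arrays of value types.
        try
        {
            objects[0] = 42;        // compiles, but the array is really a string[]
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;            // the runtime check on the store caught it
        }
    }

    public static void Main()
    {
        Console.WriteLine(StoreIsRejected());
    }
}
```

The type system accepts the store, so the only line of defense is the check the runtime performs on the assignment.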

In short, C# programmers should hide arrays deep inside fundamental data structures such as the generic List<T> class of the .NET Framework. Unfortunately, the exalted status conferred on arrays by their special syntax and dedicated book chapters makes programmers far too likely to sprinkle arrays all over public APIs. That alone is probably enough justification for wielding the axe.

But merely replacing “int[]” with “List<int>” is probably not going far enough. At the very least it should be “IList<int>” to provide some flexibility for the implementation. But that is not the whole story either: the very idea of random access needs rethinking.
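As a rough illustration of the flexibility gained, consider a method that asks for IList<int> rather than int[]. The Stats class and its Largest method are hypothetical names invented for this sketch.

```csharp
using System;
using System.Collections.Generic;

public static class Stats
{
    // Accepts any indexable collection of ints, not just an array.
    public static int Largest(IList<int> values)
    {
        if (values.Count == 0)
            throw new ArgumentException("values must not be empty");

        int best = values[0];
        for (int i = 1; i < values.Count; i++)
        {
            if (values[i] > best) best = values[i];
        }
        return best;
    }

    public static void Main()
    {
        // A List<int> and a plain array both satisfy IList<int>.
        Console.WriteLine(Largest(new List<int> { 3, 9, 4 }));
        Console.WriteLine(Largest(new[] { 7, 2 }));
    }
}
```

Callers that already hold an array lose nothing, while the implementation behind the interface remains free to change.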

For small arrays, it does not matter very much if one accesses the array elements in sequence or randomly. For large arrays, it matters a great deal. Missing the L1 cache is bad, missing the L2 cache is very bad and taking a page fault is a disaster.

Perhaps the time has come to rediscover and reinvigorate streams as the basic data structure for collections. C# v2 took a step in this direction with iterators and C# v3 turned it into a leap with LINQ.
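A small sketch of what that looks like in practice, assuming nothing beyond the standard LINQ operators. The names StreamDemo, Naturals and FirstEvenSquares are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamDemo
{
    // C# v2 iterator: an unbounded stream of natural numbers,
    // produced one element at a time as the consumer asks for them.
    public static IEnumerable<int> Naturals()
    {
        for (int n = 0; ; n++)
        {
            yield return n;
        }
    }

    // C# v3 LINQ: a lazy pipeline over the stream. Nothing is computed
    // until ToList pulls elements, and Take stops the pull after 'count'.
    public static List<int> FirstEvenSquares(int count)
    {
        return Naturals()
            .Where(n => n % 2 == 0)
            .Select(n => n * n)
            .Take(count)
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", FirstEvenSquares(4)));
    }
}
```

The source never exists as an array. Elements are visited strictly in sequence, and only as many as the consumer needs.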

I believe that C# v3 points the way to the future. A new champion programming language will have to embrace streams and make them a central part of its programming model. Perhaps lazy functional languages still have a lot more to contribute in this respect than the rather indirect influence they have so far had on C#.