Language Readability vs. Writability

Published 01 July 04 11:57 AM

In my previous post, I said:

 

“Unfortunately, language readability is often at odds with writability.”

 

And

 

“Generic method type parameters are inferred from the concrete parameters”

 

Here’s what I’m talking about:

 

            T F<T>(T t) { return t; }

 

            void Test()

            {

                  Console.WriteLine(F(1) + 2);

            }

 

Which overload of Console.WriteLine gets called? 

 

The compiler sees that you’re passing ‘1’, which is an int, to F().  So, the concrete parameter ‘t’ is an int, so the type parameter ‘T’ is int, so F() returns an int.  Add ‘2’ to an int and you get an int.  So, we are calling

 

            public static void WriteLine(int value);

 

I don’t think that’s immediately obvious from the code, but it sure is easier to write than:

 

                  Console.WriteLine(F<int>(1) + 2);
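To see the inference chain at work, here is a minimal sketch. The Describe overloads are made up for illustration; they stand in for the Console.WriteLine overload set and simply report which one the compiler picked.

```csharp
using System;

public static class InferenceDemo
{
    // Same shape as F in the post: T is inferred from the argument.
    public static T F<T>(T t) { return t; }

    // Hypothetical overloads standing in for Console.WriteLine's overloads.
    public static string Describe(int value) { return "int"; }
    public static string Describe(double value) { return "double"; }
    public static string Describe(string value) { return "string"; }

    public static void Main()
    {
        Console.WriteLine(Describe(F(1) + 2));      // prints "int"
        Console.WriteLine(Describe(F(1.0) + 2));    // prints "double"
        Console.WriteLine(Describe(F("1") + 2));    // prints "string" ("1" + 2 is string concatenation)
        Console.WriteLine(Describe(F<int>(1) + 2)); // prints "int", with T spelled out
    }
}
```

Changing only the literal passed to F() changes which overload runs, and nothing at the call site says so.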

 

The other inference added in C# 2.0 is with delegates.  Consider:

 

            delegate void D();

 

            void F() { }

 

            void Test()

            {

                  D d = F;

                  d();

            }

 

It’s certainly easier to write than:

 

                  D d = new D(F);

 

But because it’s inferred, it’s harder to read, both for tools & for devs.
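The reading cost shows up once F is overloaded. In this sketch (the DInt delegate, the second F overload, and the Last field are assumptions added for illustration), the same right-hand side ‘F’ binds to a different method depending on the delegate type on the left:

```csharp
using System;

public delegate void D();
public delegate void DInt(int x);

public static class MethodGroupDemo
{
    // Records which overload ran, so the result can be inspected.
    public static string Last = "";

    public static void F() { Last = "F()"; }
    public static void F(int x) { Last = "F(int)"; }

    public static void Main()
    {
        D d = F;        // same as new D(F); binds to F()
        DInt di = F;    // same as new DInt(F); binds to F(int)

        d();
        Console.WriteLine(Last);   // prints "F()"

        di(42);
        Console.WriteLine(Last);   // prints "F(int)"
    }
}
```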

 

The final point was about ‘foreach’.  Today you write:

 

            void Test(List<int> myList)

            {

                  foreach (int i in myList) { }

            }

 

But why not let you write:

 

            void Test(List<int> myList)

            {

                  foreach (i in myList) { }

            }

 

Can’t the compiler figure out the type of ‘i’ for you?  Yes, but now you’ve compromised readability:

 

                  foreach (i in foo.bar(baz(x))) { }

 

Now, to know the type of ‘i’, you have to correctly figure out which overload of ‘bar’ is called, which requires figuring out the correct overload of ‘baz’, too.  It could be quite complicated.
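A rough sketch of that dependency chain follows. The Bar and Baz overloads are hypothetical, and since ‘foreach (i in ...)’ is not valid C#, the sketch uses ‘var’ (added to the language in C# 3.0, after this post was written) as the closest real equivalent of an inferred loop variable.

```csharp
using System;
using System.Collections.Generic;

public static class ForeachDemo
{
    // Hypothetical overloads: the element type depends on which ones are chosen.
    public static int Baz(int x) { return x; }
    public static string Baz(long x) { return x.ToString(); }

    public static List<int> Bar(int n) { return new List<int> { n }; }
    public static List<string> Bar(string s) { return new List<string> { s }; }

    // The two method bodies below are textually identical;
    // only the parameter type differs.
    public static string ElementType(int x)
    {
        foreach (var i in Bar(Baz(x))) { return i.GetType().Name; }
        return "empty";
    }

    public static string ElementType(long x)
    {
        foreach (var i in Bar(Baz(x))) { return i.GetType().Name; }
        return "empty";
    }

    public static void Main()
    {
        Console.WriteLine(ElementType(1));   // prints "Int32"
        Console.WriteLine(ElementType(1L));  // prints "String"
    }
}
```

The loop header reads the same in both methods, yet ‘i’ is an int in one and a string in the other. A reader has to resolve both overloads to know which.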

 

Since I’m on the editor team, I think the solution lies in the editor.  When making a decision about read vs. write, one option is to select easy-to-read, and then use the editor to make it easy to write.  For example, a smart editor could infer & insert the type in the ‘foreach’ for you automatically. 

 

Hmm, guess I need to get to work!

Comments

# Kevin Dente said on July 1, 2004 2:10 PM:
Hey, while you're at it how about automatically inserting type specifications when declaring in-line arrays. E.g. "{1,2,3}" becomes "new int[] {1,2,3}". :)
# jaybaz [MS] said on July 1, 2004 2:16 PM:
Kevin: I'll check it out.
# Jeremy Marsch said on July 9, 2004 12:24 PM:
I hope I'm not too late on this comment. In terms of readability vs. writeability, our shop went with readability to extreme ends.

We require that all type names are fully qualified (we don't use the using statement for aliasing namespaces). So, all variable/parameter/return value declarations are fully qualified. We also preface all member items with this. when we refer to them.

The refactoring stuff in the editor looks really exciting. Is it possible to have an option to tell the editor to always use fully qualified type names and to use this. in the code that it writes on our behalf?
# jaybaz [MS] said on July 9, 2004 12:44 PM:
Never too late!

Jeremy: We'll use the FQN if the 'using' directives aren't there. So, I think you'll be happy with the outcome.
# Darren Oakey said on July 12, 2004 5:37 PM:
Hmm.. I think this is an odd post, because in all the explanations and examples, you've assumed something which is not just "not necessarily true" but is actually rarely or even never true.
That is, you've assumed for some reason that more information == more readable. That's completely untrue. In fact, more irrelevant information decreases readability.

Why?
Well, when you are reading or maintaining code, what is most important is understanding THE INTENT of the code. It is NOT even vaguely important, in the first few runs through, to understand THE MECHANICS of the code. For example:

> Console.WriteLine( Square(5) );
compared to
> Console.WriteLine( Square<int>(5) );

We read the first example, and we understand completely. You say it's not obvious WHICH Console.WriteLine is called, but I would argue that you neither NEED nor WANT to know. Anyone who tries to understand the underlying libraries at the same time as the piece of code they are actually investigating is a) doing a lot more work than they should and b) feeding too much information into our very limited brains, thus == confusion.

What you need to know is that 25 will appear on the screen - you don't need to know HOW (unless there is a bug and you are confident Square works; then you can right-click, go to definition, and try to track it down). But the main point is, the function is actually more readily understood _without_ knowing which overload you are calling (or that the base function is actually overloaded at all).

That's why I have a problem with all the examples above. They focus on the unimportant information that is being shown (type specifiers and the like)... That's Hungarian thinking... But the examples don't have real variable names, which is where readability actually needs to come in. For instance I would shoot anyone in my team who wrote

> foreach (i in foobar(x))

That's just bad coding. But do we reach the same conclusions about readability if we write it as it would actually appear, e.g.:

> foreach (employeeToCancel in theSetOfRetrenchedEmployees)

Is it any more readable to see

> foreach (Employee employeeToCancel in theSetOfRetrenchedEmployees)

I'd say no. It's just like people say of comments - comments should say what the code INTENDS, not what the code DOES - people can just read the code for that. The same thing with things like type inference etc - a good coder will always make that sort of thing obvious anyway - for instance we have an absolute unbreakable rule that no function (including the comments header) can be more than 25 lines... which means whenever you are looking at a function, you can ALWAYS see the whole function - so always SEE the type definitions and so forth. Having them repeated on the lines underneath is just redundant and unclear.

One more example - how much more readable is

> ShowColorChartsFor( {Color.Green, Color.Red, Color.Blue } );

compared to

> ShowColorChartsFor( new Drawing.Color [] {Color.Green, Color.Red, Color.Blue } );

The latter gives you more _information_ but that information does not actually help your _understanding_ of the functionality, in fact it hinders it..
Chomsky in his studies on human learning explains a phenomenon called blocking, which says that expected or redundant information actually inhibits comprehension [for instance if you show a kid the written word tree, and say "tree" - they will learn it, but if you show them the word tree near a picture of a tree, they will NOT learn it - because our brains are lazy and focus on the easiest thing they can - the kid knows the link between the picture and the spoken word, so does not need to form a link between the written word and the spoken word]

In the same way, in the lines above, your brain will zip off on a tangent thinking about the construction of a Color array, and miss the point, which is that you are trying to show color swatches for the colors red, green and blue. It is very important in code that we try to get the _mechanics_ out of the syntax/out of visibility as much as possible, so people can focus on the _intent_.
# jaybaz [MS] said on July 12, 2004 5:45 PM:
Darren: very interesting. Thanks for taking the time to write it all out.

It'll take my brain a little time to process, for sure.

Oh, and +2 points for referencing Chomsky. :-)
