Why no ++?

Oddly enough, my posting on how to handle a mad crush has become the third most popular article I've written so far.  Who knew there were so many 10 year old girls interested in programming language design? I may have to turn this into an agony column. I've posted some more techniques for handling a mad crush, for the benefit of all you 10 year old girls reading my blog. 

In other news, I was going through my email archive the other day and I found a discussion from back in the days when we were designing the VB.NET syntax that I thought might give some insight into the kinds of considerations that go into some of the smaller decisions.

A lot of people have opined that VB.NET and C# are in many ways the "same language", just with different syntax. If you were to take a chess board and change it from black and white to red and blue and you renamed all the pieces, but didn't change the more fundamental rules about the ways the pieces moved, the new game would in a very real sense be "the same game as chess," right? Whether VB.NET and C# really are in that sense mere syntactic variations on each other is a debatable point -- I think it is not nearly so cut-and-dried as some people think. There are a lot of reasons I think that, and I don't want to go into all of them today. One of the most interesting and important to me though is the most ineffable -- the "spirit" of a language in many ways both transcends its syntax and suffuses some small decisions in interesting ways.

Let's consider just one small example. VB6 did not have the += operator familiar to all C-like-language developers. In VB6 if you wanted to add ten to a number you said

x = x + 10

Looking at that, there's some textual redundancy. Clearly the intention of the programmer is "increment the variable by ten", so why do we have to state the variable twice? This is an extremely common, simple operation, so we can have a shortcut syntax for it that expresses it more compactly. (Such a shortcut, whereby a relatively cumbersome but legal syntax is replaced by a simpler syntax that adds no additional real representational power to the language, is called syntactic sugar.)

x += 10

And indeed, VB.NET has this operator.

Since incrementing a variable by one is also an extremely common operation, C-like languages have an even more compact syntactic sugar for this operation: x++ or ++x

When the decision was made to add the += operator to VB.NET, one of our consultants commented

I am pleased to see the += construct (and I assume the other similar constructs) but the increment and decrement operators ++ -- are missing; this seems like an oversight. […] if the += etc. are allowed then it is natural that the ++ be allowed.

Actually, it isn't as natural as you might think, because VB.NET and C# are NOT mere syntactic variations on each other, and nor should they be. The reason why VB.NET doesn't have the increment operator illustrates the difference in spirit between VB.NET and C#. In fact, it is NOT the case that a C-like k++ is syntactic sugar for a VB-like k += 1. This is because in VB there has always been a strong line drawn between statements and expressions. Not so in languages like C. This is a perfectly legal C (and, for that matter, JScript) fragment, for instance:

{
  2 + 2;
}

Not a particularly useful fragment, one that would probably produce a compiler warning, but legal -- because in C, a bare expression with no side effects is a legal statement.

In C it is also legal to go to the other end of the spectrum and say

x = (k += 1);

k += 1 is an expression in C which returns the assigned value and as a side effect assigns the value to k (and then to x). Most sensible people never use this fact about the += operator, but it is nonetheless true in C because C makes weak distinctions between statements and expressions.
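
Here is a minimal C sketch of that point. The struct and function names are made up for illustration; the only thing taken from the text is the fragment x = (k += 1).

```c
/* Hypothetical container for the two variables in `x = (k += 1);` */
struct result { int k; int x; };

/* Runs the fragment from the text on a starting value of k. */
struct result add_one_and_copy(int k)
{
    struct result r;
    r.x = (k += 1);   /* `k += 1` is an expression: it stores k + 1 in k
                         and yields the stored value, which lands in x */
    r.k = k;
    return r;
}
```

Starting from k = 100, both k and x end up holding 101.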

That's not the case in VB.NET. In VB k += 1 is not an expression, it is a statement, and ne'er the twain shall meet.  Such a bizarre construction is a syntax error in VB.NET.

But k++ is an expression. You can say in C

x = (k++) * (++k);

This asks for both the side effects (two increments), the multiplication and the assignment to x, although the C standard leaves the result undefined when k is modified twice with no sequence point in between. Using ++ inside a larger expression is an extremely common usage in C-like languages, so if we were going to add the ++ operator, then developers would expect that k++ is an expression in VB.NET, not a statement. (The fact that they are expressions explains why there are two forms -- one form returns the value of the variable before it is incremented, the other after.)
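
A small C sketch of the two forms, with each one kept in its own statement so the behavior is well defined. The struct and function names are hypothetical.

```c
/* What one increment expression yields, and what k holds afterwards. */
struct observed { int yielded; int k_after; };

struct observed postfix_increment(int k)
{
    struct observed r;
    r.yielded = k++;   /* yields the old value; k is incremented as a side effect */
    r.k_after = k;
    return r;
}

struct observed prefix_increment(int k)
{
    struct observed r;
    r.yielded = ++k;   /* increments k and yields the new value */
    r.k_after = k;
    return r;
}
```

With k starting at 100, the postfix form yields 100 and the prefix form yields 101; in both cases k is 101 afterwards.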

If you make the increment operator only legal in statements -- alone on a single line -- then saving that keystroke is clearly not worth the testing effort that would go into this -- much less the dev effort, the documentation effort, the localization effort, the specs that would have to be written, etc.

But if you make it a legal expression in VB.NET, you run into all kinds of problems. This operator causes problems in C-like languages, and it would cause the same problems in VB.NET.

Consider a hypothetical world in which k++ and ++k are legal expressions. We would have to come up with some definition of what this code does:

Function Foo(ByRef x, ByVal y)
  Foo = x * y
  x = 10
End Function

k = 100
z = Foo(k++, ++k)

First of all, what numbers get passed to Foo? Is this the same as

z = Foo(100, 101) -- stick k on the stack, do ++k, stick k on the stack, do k++

or this?

z = Foo(100, 102) -- stick k on the stack, do k++, do ++k, stick k on the stack

or this?

z = Foo(101, 101) -- do ++k, stick k on the stack, stick k on the stack, do k++
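
To make the three readings concrete, here is a C sketch that spells each one out as an explicit sequence of steps. It only records which values Foo would receive; the struct and function names are invented for the illustration, and the ByRef question is ignored here.

```c
/* The two argument values Foo would see, plus k once both increments ran. */
struct call { int x; int y; int k_after; };

/* Reading 1: push k, do ++k, push k, do k++ */
struct call reading_one(int k)
{
    struct call c;
    c.x = k;
    ++k;
    c.y = k;
    k++;
    c.k_after = k;
    return c;
}

/* Reading 2: push k, do k++, do ++k, push k */
struct call reading_two(int k)
{
    struct call c;
    c.x = k;
    k++;
    ++k;
    c.y = k;
    c.k_after = k;
    return c;
}

/* Reading 3: do ++k, push k, push k, do k++ */
struct call reading_three(int k)
{
    struct call c;
    ++k;
    c.x = k;
    c.y = k;
    k++;
    c.k_after = k;
    return c;
}
```

Starting from k = 100, the three readings hand Foo (100, 101), (100, 102) and (101, 101) respectively, and all three leave k at 102.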

Second, does the k++ pass a reference to k to the byref parameter, or does it pass it by value? Suppose it passes it by reference -- then what does k equal when Foo returns? Do we do it like this:

Stick k on the stack, do ++k, stick k on the stack, do k++, call Foo, assign 10 to k

or this:

Stick k on the stack, do ++k, stick k on the stack, call Foo, assign 10 to k, do k++, so it's 11.

Or, if it passes by value then k is never set to 10, so we're fine -- k is definitely 102 when we're done as it is incremented twice.
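
The three outcomes for k can be sketched in C, with a pointer standing in for the ByRef parameter. This is only one way to model the hypothetical; the function names are made up, and foo is a C stand-in for the VB Foo above.

```c
/* C stand-in for Foo: the ByRef parameter becomes a pointer. */
static int foo(int *x, int y)
{
    int product = *x * y;
    *x = 10;
    return product;
}

/* By reference, both increments finish before the call. */
int k_when_increments_come_first(void)
{
    int k = 100;
    int *first = &k;      /* "stick k on the stack" as a reference */
    ++k;
    int second = k;
    k++;
    foo(first, second);   /* the callee's write is the last word */
    return k;
}

/* By reference, the k++ is deferred until after the call. */
int k_when_postfix_is_deferred(void)
{
    int k = 100;
    int *first = &k;
    ++k;
    int second = k;
    foo(first, second);   /* sets k to 10 */
    k++;
    return k;
}

/* By value: the callee writes to a copy, so k keeps both increments. */
int k_when_passed_by_value(void)
{
    int k = 100;
    int copy = k;
    ++k;
    int second = k;
    k++;
    foo(&copy, second);
    return k;
}
```

The three functions return 10, 11 and 102, matching the three cases described above.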

We could come up with some answers to these questions, but they would be rather arbitrary. The answers are apparently sufficiently inobvious that the C++ standard leaves several of them unanswered, and therefore such code is not portable! The question about function parameter list evaluation order is dismissed as "The order of evaluation of arguments is undefined; take note that compilers differ. The order of evaluation of the postfix expression and the argument expression list is undefined." The question about whether k++ runs before or after the function runs is answered: "All side effects of argument expressions take effect before the function is entered." And with regard to whether the increment returns a reference to k -- in C++, the ++ and -- operators take lvalues, but only the prefix forms return lvalues; the postfix forms return a copy of the old value. So if VB.NET worked like C++ in this regard, evaluating ++k as an argument would have to pass a reference to k to a function expecting a byref argument, while k++ would yield only a temporary copy.

Clearly these problems are solvable -- we solved them in C#, obviously! -- but you have to ask yourself how valuable saving that keystroke is if it adds these kinds of complexities to the language semantics. Adding complexity isn't necessarily bad, but you should get value for your additional complexity, not a single keystroke saved!

k++ simply is not very BASIClike. Maybe that marks me as hopelessly old-fashioned that I'm saying that anything is not BASIClike when we have a BASIC that has object polymorphism! Call me old-fashioned then -- but I think that VB.NET is not and should not be "C# without braces" and that there are expression idioms which don't particularly make sense in VB. Heck, I would argue that ++ is an idiom which does not work particularly well in C, C++, Java, JScript or C#. Sure, ++ in C lets you write very dense code that can only be read if you understand the particular idiom of C -- but dense code is not necessarily fast code or maintainable code or readable code or correct code. Personally I only ever use ++ in loop incrementers and when walking strings a byte at a time -- I'll gladly change ++'s to +=1's.

I'm all for adding syntactic sugar that makes code less verbose. += is a great example of that as it adds nothing new to the language, it just makes an existing common operation lexically shorter. It's a real sugar. But increment operators are not mere sugar; they add new functionality, opening up immense cans of worms at huge cost for small gain.

  • A few comments:

    > k += 1 is an expression in C which returns the assigned value and as a side effect assigns the value to k (and then to x). Most sensible people never use this fact about the += operator, but it is nonetheless true in C because C makes weak distinctions between statements and expressions.

    Most people may not use this feature of +=, but many do use this feature of the shift operators << and >>, if they use C++ stream I/O that is. Also, given that += is used for adding delegates in C# (a bad choice IMO), this feature might be used to add multiple delegates at once.

    > Personally I only ever use ++ in loop incrementers and when walking strings a byte at a time -- I'll gladly change ++'s to +=1's.

    While I don't know it for a fact, I consider the C/C++ for looping construct to be the prime motivation for the ++ operator. That and the fact that C was designed as a sort of high-level assembly, and many assembly languages have an INC instruction.

    BTW, remember to use postfix ++ and -- in C++, at least for iterators.

    I must say that for me a distinction based on the "spirit" of the language is too vague. The type of distinctions I'm looking for are more along the lines of "the differences between languages X and Y make X more suitable for this class of applications and Y more suitable for that class of applications". Unfortunately I still can't figure out when Microsoft expects me to use C#, VB.NET or JScript.NET.
  • Prefix, Dan. Prefix.
  • I learned BASIC on an Altair and very much appreciate all efforts to retain the "spirit" of the language.

    The main difference, in my opinion, between the two languages is the balance between power and complexity. Choosing the optimum for both is not an easy thing.
  • >>Oddly enough, my posting on how to handle a mad crush has become the third most popular article I've written so far. Who knew there were so many 10 year old girls interested in programming language design?<<

    It's because you're just a fabulous writer [a.k.a. entirely too talented for one human :)], and a subject like that shows it off to great effect. And makes me giggle.

    At least that's why I've read it more than twice.

  • Hey, did you just send me a mash note? :-)

    Thanks, that's a nice thing to say.

    So what do you do when you have a mad crush on a boy?
  • I wonder if a post on "what to do when you have a mad crush on a _girl_" might actually be more useful to the demographic for your blog. :-) With all respect to readers such as " A girl, but not 10 years old."

  • > Prefix, Dan. Prefix

    Whoops, my bad. Never post after midnight ...
  • >>Note that this means that nonsense like k += --(--k++)++); is legal C++

    Nope, it's not. ++/-- operators require l-value but do not return them. So --k++ is not legal. You also can't pass k++ by reference. Also using several side-effects on the same variable in one statement is hardly used by anyone in his right mind, because the result is unpredictable.

    Don't know how this behaves in C# though.
  • When I did support for borland I remember someone having a problem with converting a two character string to an integer.
    char *x; int y;
    y=(*(x++)-'0')*10 + *(x++)-'0';
    Since the ++ bit is only guaranteed to happen
    after the statement executes this actually turned into:
    y=(*x-'0')*10 + *x -'0'; x+=2;
    Perfectly legal according to the ANSI spec.
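
A sketch of how that conversion can be written so the result does not depend on when the increments happen: each x++ goes in its own statement. The function name two_digits is hypothetical, and the sketch assumes the input really is two ASCII digits.

```c
/* Converts the first two ASCII digits of s to an int.
   Assumes s points at two digit characters; no validation is done. */
int two_digits(const char *s)
{
    const char *x = s;
    int tens = *x++ - '0';   /* read the first digit, then advance */
    int ones = *x++ - '0';   /* a new statement, so x has already moved on */
    return tens * 10 + ones;
}
```

For example, two_digits("42") gives 42, where the one-line version could legally read the same character twice.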
  • Are you sure += is only syntactic sugar?

    What about:
    x.longCalculation().y += 1

    I sure hope longCalculation isn't called twice.
  • This reminds me of a discussion I had a couple of years ago when I suggested that the various "End <construct>" statements could have been reduced to simple "End" statements in VB.NET. It was quite possible to do, of course, would save a few characters when typing, and would even simplify automated code generation.
    It was a very un-Basiclike idea though; that small added redundancy can make a huge difference to comprehension of code.
  • Some1: Whoops. You are correct. I've removed the error. Thanks!
  • Sjoerd: OK, then it's syntactic sugar for

    Temp = foo.longcalc()
    Temp.y = Temp.y + 1
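
The same question can be checked in C, where the standard says the left operand of a compound assignment is evaluated only once. In this sketch long_calculation is a hypothetical stand-in for x.longCalculation() that counts its own calls; all names are invented for the illustration.

```c
struct point { int y; };

static struct point storage;   /* the object the "calculation" finds */
static int calls;              /* how many times the helper has run */

/* Hypothetical stand-in for the expensive call. */
static struct point *long_calculation(void)
{
    calls++;
    return &storage;
}

/* Compound form: the left operand is evaluated once. */
int calls_for_compound_form(void)
{
    calls = 0;
    long_calculation()->y += 1;
    return calls;
}

/* Spelled-out form: the call appears twice, so it runs twice. */
int calls_for_spelled_out_form(void)
{
    calls = 0;
    long_calculation()->y = long_calculation()->y + 1;
    return calls;
}
```

The compound form makes one call and the spelled-out form makes two, which is why the temporary in the expansion above matters.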
  • Eric,

    I had to read your argument marking a line of control between expressions and statements a couple of times to get it into my head. Would you think I've understood the distinction from the following discussion?

    Statement: A statement is a complete program instruction delimited by the statement terminator; a new line in the case of VB.NET and the semi-colon in the case of C-like languages. Two or more statements cannot be compounded to form another statement. As such, a statement is an independent unit of a program.

    Eg.

    Dim IntNumber As Integer <Statement/>
    Dim IntAnotherNumber As Integer <Another Statement/>
    IntNumber +=IntAnotherNumber <Yet Another Statement/>

    Expression: An expression is the smallest unit of a program instruction. Expressions are compounded together to form a statement.

    Eg.

    Dim IntNumber As Integer = 1200
    Dim IntAnotherNumber As Integer = 600
    Dim IntThirdNumber As Integer = 100
    IntThirdNumber = (IntNumber+IntAnotherNumber)/IntThirdNumber

    Here,
    (IntNumber+IntAnotherNumber) is an expression, and the result of this expression divided by IntThirdNumber is a second expression, whereas the complete instruction

    IntThirdNumber = (IntNumber+IntAnotherNumber)/IntThirdNumber

    is a statement.

    Under such a scheme of distinction, a statement such as this should be illegal:

    IntNumber = (IntAnotherNumber += IntThirdNumber)

    because (IntAnotherNumber += IntThirdNumber) in itself is an independent statement.

    Seems to make sense. Have I got it right?
  • You're on the right track.

    A more rigorous way to think about it though would be to consider the grammar of the language. Let me give you a quick example of what I mean. Part of the grammar for a simple language might be something like

    * A PROGRAM is a list of STATEMENTS.
    * A STATEMENT is either an ASSIGNMENTSTATEMENT or a CONDITIONALSTATEMENT
    * An ASSIGNMENTSTATEMENT is VARIABLENAME = EXPRESSION
    * An EXPRESSION is an ADDEXPRESSION or a MULTIPLYEXPRESSION or a NUMBER or a VARIABLENAME
    * An ADDEXPRESSION is an EXPRESSION followed by + or -, followed by another EXPRESSION
    * etc.

    In this grammar "x = 1 * 3 + z * 4" is a PROGRAM that consists of one STATEMENT, an ASSIGNMENTSTATEMENT, which consists of an ADDEXPRESSION that consists of two MULTIPLYEXPRESSIONS... etc.

    So what does this have to do with VB vs C? Well, in C, a STATEMENT can be a lot of things, _including_ just an expression. In VB, a STATEMENT cannot be just an EXPRESSION.

    This is particularly important in VB because in a STATEMENT, = means assignment, but in an EXPRESSION it means comparison. (That then explains why C has a different operator for comparison, ==.)
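
As a rough illustration of the grammar idea, here is a tiny C recognizer for a cut-down version of that toy language. It is a sketch under simplifying assumptions: it handles only assignment statements, treats +, - and * as one flat list without precedence, and has no parentheses. All function names are made up.

```c
#include <ctype.h>

static const char *p;   /* cursor into the text being checked */

static void skip_spaces(void)
{
    while (*p == ' ') p++;
}

/* NUMBER or VARIABLENAME */
static int operand(void)
{
    skip_spaces();
    if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p)) p++;
        return 1;
    }
    if (isalpha((unsigned char)*p)) {
        while (isalnum((unsigned char)*p)) p++;
        return 1;
    }
    return 0;
}

/* EXPRESSION: operands separated by +, - or * */
static int expression(void)
{
    if (!operand()) return 0;
    for (;;) {
        skip_spaces();
        if (*p != '+' && *p != '-' && *p != '*') return 1;
        p++;
        if (!operand()) return 0;
    }
}

/* Returns 1 if the whole text is an EXPRESSION. */
int is_expression(const char *text)
{
    p = text;
    if (!expression()) return 0;
    skip_spaces();
    return *p == '\0';
}

/* Returns 1 if the whole text is VARIABLENAME = EXPRESSION. */
int is_statement(const char *text)
{
    p = text;
    skip_spaces();
    if (!isalpha((unsigned char)*p)) return 0;
    while (isalnum((unsigned char)*p)) p++;
    skip_spaces();
    if (*p != '=') return 0;
    p++;
    if (!expression()) return 0;
    skip_spaces();
    return *p == '\0';
}
```

In this VB-like arrangement "x = 1 * 3 + z * 4" is a statement, while "1 * 3 + z * 4" on its own is an expression but not a statement. A C-like grammar would add one more rule saying a statement may be just an expression.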