Why no ++?


Oddly enough, my posting on how to handle a mad crush has become the third most popular article I've written so far.  Who knew there were so many 10 year old girls interested in programming language design? I may have to turn this into an agony column. I've posted some more techniques for handling a mad crush, for the benefit of all you 10 year old girls reading my blog. 

In other news, I was going through my email archive the other day and I found a discussion from back in the days when we were designing the VB.NET syntax that I thought might give some insight into the kinds of considerations that go into some of the smaller decisions.

A lot of people have opined that VB.NET and C# are in many ways the "same language", just with different syntax. If you were to take a chess board and change it from black and white to red and blue and you renamed all the pieces, but didn't change the more fundamental rules about the ways the pieces moved, the new game would in a very real sense be "the same game as chess," right? Whether VB.NET and C# really are in that sense mere syntactic variations on each other is a debatable point -- I think it is not nearly so cut-and-dried as some people think. There are a lot of reasons I think that, and I don't want to go into all of them today. One of the most interesting and important to me though is the most ineffable -- the "spirit" of a language in many ways both transcends its syntax and suffuses some small decisions in interesting ways.

Let's consider just one small example. VB6 did not have the += operator familiar to all C-like-language developers. In VB6 if you wanted to add ten to a number you said

x = x + 10

Looking at that, there's some textual redundancy there. Clearly the intention of the programmer is "increment the variable by ten", so why do we have to state the variable twice? This is an extremely common, simple operation so we can have a short-cut syntax for it that expresses it more compactly. (Such a shortcut, whereby a relatively cumbersome but legal syntax is replaced by a simpler syntax that adds no additional real representational power to the language, is called syntactic sugar.)

x += 10

And indeed, VB.NET has this operator.

Since incrementing a variable by one is also an extremely common operation, C-like languages have an even more compact syntactic sugar for this operation: x++ or ++x
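As a minimal C++ sketch of the point so far: when each form is used as a standalone statement, the long and short spellings have the same effect. The function names are made up for this illustration.

```cpp
// Each function changes its local copy using a different spelling
// and returns the result. All names here are illustrative only.
int add_ten_long(int x)  { x = x + 10; return x; }
int add_ten_short(int x) { x += 10;    return x; }

int add_one_compound(int k) { k += 1; return k; }
int add_one_postfix(int k)  { k++;    return k; }
int add_one_prefix(int k)   { ++k;    return k; }
```

Called with the same argument, the first two functions agree with each other, and so do the last three. The differences only show up once the increment is used as part of a larger expression.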

When the decision was made to add the += operator to VB.NET, one of our consultants commented

I am pleased to see the += construct (and I assume the other similar constructs) but the increment and decrement operators ++ -- are missing; this seems like an oversight. […] if the += etc. are allowed then it is natural that the ++ be allowed.

Actually, it isn't as natural as you might think, because VB.NET and C# are NOT mere syntactic variations on each other, nor should they be. The reason why VB.NET doesn't have the increment operator illustrates the difference in spirit between VB.NET and C#. In fact, it is NOT the case that a C-like k++ is syntactic sugar for a VB-like k += 1. This is because in VB there has always been a strong line drawn between statements and expressions. Not so in languages like C. This is a perfectly legal C (and, for that matter, JScript) fragment, for instance:

{
  2 + 2;
}

Not a particularly useful fragment, one that would probably produce a compiler warning, but legal -- because in C, a bare expression with no side effects is a legal statement.

In C it is also legal to go to the other end of the spectrum and say

x = (k += 1);

k += 1 is an expression in C which returns the assigned value and as a side effect assigns the value to k (and then to x). Most sensible people never use this fact about the += operator, but it is nonetheless true in C because C makes weak distinctions between statements and expressions.
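A small C++ sketch of that behavior, assuming a hypothetical helper type `Values` that exists only to return both variables:

```cpp
// Holds the two variables after the assignment so a caller can inspect them.
struct Values { int k; int x; };

// Uses the value of the compound assignment "k += 1" as the
// right-hand side of a second assignment.
Values assign_from_compound(int k) {
    int x = 0;
    x = (k += 1);   // k is incremented first; its new value is then stored in x
    return Values{k, x};
}
```

Starting from k = 5, both k and x end up as 6.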

That's not the case in VB.NET. In VB k += 1 is not an expression, it is a statement, and ne'er the twain shall meet.  Such a bizarre construction is a syntax error in VB.NET.

But k++ is an expression. You can say in C

x = (k++) * (++k);

And get both the side effects (two increments), the multiplication and the assignment to x. This is an extremely common usage in C-like languages, so if we were going to add the ++ operator, then developers would expect that k++ is an expression in VB.NET, not a statement. (The fact that they are expressions explains why there are two forms -- one form returns the value of the variable before it is incremented, the other after.)
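The difference between the two forms can be sketched in C++ as follows. Each increment is kept in its own statement here so that the result does not depend on the order in which a compiler evaluates the operands of a larger expression. The `Snapshot` type and the function names are illustrative assumptions.

```cpp
// Records what the increment expression evaluated to, and what k held afterwards.
struct Snapshot { int expression_value; int k_afterwards; };

// Postfix: the expression yields the value k had before the increment.
Snapshot postfix_form(int k) {
    int v = k++;
    return Snapshot{v, k};
}

// Prefix: the expression yields the value k has after the increment.
Snapshot prefix_form(int k) {
    int v = ++k;
    return Snapshot{v, k};
}
```

With k = 100, the postfix form evaluates to 100 and the prefix form to 101; in both cases k is 101 afterwards.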

If you make the increment operator only legal in statements -- alone on a single line -- then saving that keystroke is clearly not worth the testing effort that would go into this -- much less the dev effort, the documentation effort, the localization effort, the specs that would have to be written, etc.

But if you make it a legal expression in VB.NET, you run into all kinds of problems. This operator causes problems in C-like languages, and it would cause the same problems in VB.NET.

Consider a hypothetical world in which k++ and ++k are legal expressions. We would have to come up with some definition of what this code does:

Function Foo(ByRef x, ByVal y)
  Foo = x * y
  x = 10
End Function

k = 100
z = Foo(k++, ++k)

First of all, what numbers get passed to Foo? Is this the same as

z = Foo(100, 101) -- stick k on the stack, do ++k, stick k on the stack, do k++

or this?

z = Foo(100, 102) -- stick k on the stack, do k++, do ++k, stick k on the stack

or this?

z = Foo(101, 101) -- do ++k, stick k on the stack, stick k on the stack, do k++
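To make the three candidate orderings concrete, here is a C++ sketch that spells each one out as a sequence of separate statements. Nothing here says which ordering a real compiler would pick; the `Args` type and the function names are assumptions made for the illustration.

```cpp
// The two values that would be passed to Foo, plus k's final value.
struct Args { int first; int second; int k_after; };

// "stick k on the stack, do ++k, stick k on the stack, do k++"
Args order_a(int k) {
    int first = k;
    ++k;
    int second = k;
    k++;
    return Args{first, second, k};
}

// "stick k on the stack, do k++, do ++k, stick k on the stack"
Args order_b(int k) {
    int first = k;
    k++;
    ++k;
    int second = k;
    return Args{first, second, k};
}

// "do ++k, stick k on the stack, stick k on the stack, do k++"
Args order_c(int k) {
    ++k;
    int first = k;
    int second = k;
    k++;
    return Args{first, second, k};
}
```

Starting from k = 100, these produce the argument pairs (100, 101), (100, 102) and (101, 101) respectively, and k is 102 at the end in all three.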

Second, does the k++ pass a reference to k to the byref parameter, or does it pass it by value? Suppose it passes it by reference -- then what does k equal when Foo returns? Do we do it like this:

Stick k on the stack, do ++k, stick k on the stack, do k++, call Foo, assign 10 to k

or this:

Stick k on the stack, do ++k, stick k on the stack, call Foo, assign 10 to k, do k++, so it's 11.

Or, if it passes by value then k is never set to 10, so we're fine -- k is definitely 102 when we're done as it is incremented twice.
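The three outcomes for k can be sketched in C++, using a reference parameter as a stand-in for ByRef. This is only a model of the hypothetical VB.NET semantics discussed above, with each step written as its own statement; the function names other than `foo` are invented for the sketch.

```cpp
// C++ stand-in for the VB function: x plays the ByRef parameter, y the ByVal one.
int foo(int& x, int y) {
    int result = x * y;
    x = 10;              // writes through to whatever the caller passed for x
    return result;
}

// By reference, with k++ performed before the call: foo's assignment wins.
int k_if_increment_before_call() {
    int k = 100;
    ++k;
    int y = k;           // 101 is passed by value
    k++;                 // k is 102 going into the call
    foo(k, y);           // foo assigns 10 to k
    return k;
}

// By reference, with k++ performed after the call returns.
int k_if_increment_after_call() {
    int k = 100;
    ++k;
    int y = k;
    foo(k, y);           // foo assigns 10 to k
    k++;                 // the postponed increment then makes it 11
    return k;
}

// By value: foo only ever sees a copy, so k keeps both increments.
int k_if_passed_by_value() {
    int k = 100;
    int x_copy = k;      // the value of k++, copied for the first argument
    ++k;
    int y = k;
    k++;
    foo(x_copy, y);      // the 10 lands in the copy, not in k
    return k;
}
```

The three functions leave k at 10, 11 and 102 respectively, which is exactly the ambiguity a language definition would have to resolve.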

We could come up with some answers to these questions, but they would be rather arbitrary. The answers are apparently sufficiently inobvious that the C++ standard leaves several of them unanswered, and therefore such code is not portable! The question about function parameter list evaluation order is dismissed as "The order of evaluation of arguments is undefined; take note that compilers differ. The order of evaluation of the postfix expression and the argument expression list is undefined." The question about whether k++ runs before or after the function runs is answered: "All side effects of argument expressions take effect before the function is entered." And with regard to whether k++ returns a reference to k -- in C++, the ++ and -- operators take lvalues and return lvalues, so yes, if VB.NET worked like C++ in this regard, evaluating k++ as an argument would have to pass a reference to k to a function expecting a byref argument.
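The one rule quoted above that does give a definite answer, that argument side effects are complete before the function body runs, can be observed with a small C++ sketch. The global variables and function names are assumptions chosen so the callee can look at k directly.

```cpp
int k = 100;             // global, so the callee can read it directly
int k_seen_inside = 0;   // value of k that the callee observed on entry

// Records the current value of the global k and hands back its argument.
int observe(int argument) {
    k_seen_inside = k;
    return argument;
}

// Resets k, then passes k++ as the argument and returns what observe received.
int call_with_postfix() {
    k = 100;
    return observe(k++);  // the argument carries the old value of k
}
```

After `call_with_postfix()`, the returned argument is 100 while `k_seen_inside` is 101: the increment had already happened by the time `observe` started. The relative order of two or more arguments is a separate question, and this sketch says nothing about it.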

Clearly these problems are solvable -- we solved them in C#, obviously! -- but you have to ask yourself how valuable saving that keystroke is if it adds these kinds of complexities to the language semantics. Adding complexity isn't necessarily bad, but you should get value for your additional complexity, not a single keystroke saved!

k++ simply is not very BASIClike. Maybe that marks me as hopelessly old-fashioned that I'm saying that anything is not BASIClike when we have a BASIC that has object polymorphism! Call me old-fashioned then -- but I think that VB.NET is not and should not be "C# without braces" and that there are expression idioms which don't particularly make sense in VB. Heck, I would argue that ++ is an idiom which does not work particularly well in C, C++, Java, JScript or C#. Sure, ++ in C lets you write very dense code that can only be read if you understand the particular idiom of C -- but dense code is not necessarily fast code or maintainable code or readable code or correct code. Personally I only ever use ++ in loop incrementers and when walking strings a byte at a time -- I'll gladly change ++'s to +=1's.

I'm all for adding syntactic sugar that makes code less verbose. += is a great example of that as it adds nothing new to the language, it just makes an existing common operation lexically shorter. It's a real sugar. But increment operators are not mere sugar; they add new functionality, opening up immense cans of worms at huge cost for small gain.

  • >In VB, a STATEMENT cannot be just an EXPRESSION.

    Thanks! I get it.
  • Eric, you're right in reply to Sjoerd that it's sugar for:

    Temp = foo.longcalc()
    Temp.y = Temp.y + 1

    But note that you can't do that in C with the same efficiency. In languages where everything is a primitive or a pointer (like Java, C#, VB), it's no longer necessary because the temp doesn't incur any additional overhead.

    I also think using += makes the code clearer:

    xs[x+y*rb-j] += 3;

    because, if you were to see

    xs[x+y*rb-j] = xs[x+y*rb-j] + 3;

    instead, it wouldn't be clear if the subscripts should be the same or not, and begs for bugs when the code changes.

    I also believe that the expression/statement distinction is still only a surface syntactic issue. There are transformations in C# such that anything you can do in it, you can do in VB.

    NB: Well, theoretically, that's true for any TC language to any other, but such transformations can be achieved with nothing smarter than a pass-once parser, which need not do any global changes. Felleisen's definition of expressiveness, therefore, would argue that these languages have the same expressiveness.
  • Different thoughts on the operators.
  • Such an argument -- that there is a relatively simple morphism between C# and VB which preserves computations -- is certainly interesting from a computer science perspective.

    However, the vast majority of users of both C# and VB are not professional computer scientists and they really don't care much about notions like expressivity!

    As you point out, the most expressive class is that of the Turing Complete languages. The fact that any computable problem can be solved in either language is not particularly germane to the designed-in differences between the languages! The designed-in differences have a lot more to do with practical differences between different developer constituencies than any particular class of computational tasks.

    I'm hoping Paul Vick talks about this a bit in his blog later this week. It's a big topic.
  • Thanks for the reply... My argument was that, while all TC languages can do "the same" thing, they do not share the same expressiveness. For example, if I were to translate some pure lambda calculus to Scheme, it would be fairly simple, as Scheme is a superset (of sorts) of it. Likewise, any Scheme program could be turned into pure lambda calculus. But now suppose that we take the program converted to Scheme, and then throw some assignments into it, giving it side-effects here and there. It could still be translated to lambda calculus, but it would require global changes to the code. Thus, Scheme is more expressive than lambda calculus, though computationally they can do the same thing.

    With the C# and VB models, changes from one to the other are easy, and changes made to the converted programs are easy and always local (however, I'm not an expert with VB.NET, so there may be a special feature I'm missing). Thus, there really only is a superficial difference between the two.

    You're absolutely right that the common programmer doesn't care about these finer language points: I know so many people who think inner-classes in Java are a pain because they require "too much typing." The semantics of Java inner-classes are superb: they are basically lexical closures from Scheme; but if you were to use them as you would lambda in Scheme, the code would look quite large, and the meaning would be less apparent. So, we end up with iterators and less expressive features that are, at least, easier to type.
  • I've missed the syntax:

    {ref. expression} := * {op} {expression}

    That is available in Burroughs Algol (Unisys ClearPath NX/LX family machines).

    The "*" basically tells the compiler to re-use the reference on the stack that resulted from resolving the reference expression whether a scalar entity or an array element, etc.

    More flexible than +=/-= because it can be used to set/clear bitfields, etc. as well as merely inc/dec. Also it might be a bit more "Basic-like" than +=/-=, though it doesn't (necessarily) touch the issues surrounding ++var or var++, et al.

    After all, that Algol compiler allows embedded assignments in expressions though their use is frowned on in most instances:

    {ref1} := {exp1} {op1} ({ref2} := * {op2} {exp2})

    ... and so on.
  • I always hate it when someone says "syntactic sugar" because it is usually condescending and dismissive, and indicates that I'm not going to get the language feature I want! :)
  • Eric, based on what you've already written you are probably not going to agree with me, but...

    I hear what you are saying about the spirit of the BASIC language, but I would argue that the spirit should not be maintained such that you lose sight of the real goals. For example, just because an SUV is not a sports car does not mean it shouldn't incorporate technologies from sports cars that improve handling and performance if and when it helps it achieve its own goals.

    For me, I think VB.NET should be the easiest tools to create readable, robust, and maintainable business systems, with business systems being anything a business might want to implement for internal use and an ISV might want to implement for a vertical (yes, there are holes in this definition, but it's late so forgive me). Operating systems and printer drivers come to mind as not being business systems, but almost everything above that is. VB should be able to be used to implement robust and powerful server software, for example.

    So the fact that VB originally maintained statements and expressions as separate and distinct should not keep you from evolving the language. I don't see any reason why VB can't be evolved to allow more powerful expressions. The dBase language was very much a statement vs. expression language, and one of the best programming languages I've ever used was a dBase compiler called Clipper that extended the language in exactly that way (I know, I wrote a ~1000 page book on it!) Clipper v5.0 added support for ++, --, and inline assignment using the ":=" operator as well as a lot of other "syntactic sugar" <shudder; I hate that phrase>.

    So I actually agree that ++ and -- don't make sense to be added on their own. But what I think would make sense would be to support richer expressions in VB such as an inline assignment operator like ":=" and to also support ++ and --. Allowing inline assignment in an expression would not require that you treat all statements as if they were expressions. As you've already made quite clear, VB.NET and C# are not exactly alike so this aspect would not have to behave exactly alike. Look at Clipper; it was very much a statement language and they made it work very nicely (too bad 10 years ago Computer Associates bought Clipper and not Microsoft... :)

    Well I tried...
  • > I think VB.NET should be the easiest tools to create readable, robust, and maintainable business systems

    I 100% agree. C-style operators work against that goal by making the language more dense. More dense means less readable and maintainable, and less maintainable means less robust. Short cuts make long delays!
  • >> More dense means less readable and maintainable

    That's where you and I will have to disagree. :) (But then what I mean and what you mean are probably different.)

    Riddle me this: when is it better to have a 92 line VB.NET program instead of a 12 line VB.NET program? When you want to reserve the use of property signatures for future class compatibility instead of using field signatures!

    GOD I HATE THE VB.NET REQUIREMENT FOR 9 LINES OF CODE FOR EACH STUPID PROPERTY!!!

    1. Private _Foo As String
    2. Public Property Foo() As String
    3. Get
    4. Return _Foo
    5. End Get
    6. Set(ByVal Value As String)
    7. _Foo = Value
    8. End Set
    9. End Property


    Why oh why didn't the VB team feel it would be beneficial to simplify the basic standard cases, i.e.:

    1. Public Property Foo As String Uses _Foo

    or even just

    1. Public Property Foo As String

    ?!@?!@?!@?

    To me, being able to see more of the program and not having to scroll hundreds of lines through fluff would make maintenance a lot easier than the expando code required by properties!

    Sorry for the rant. Paul V told me it was a feature considered for Whidbey but not deemed important, and your comments triggered my latent frustration! Sorry. :(
  • Eric, you are wrong. Again! You wrote

    "And with regard to whether k++ returns a reference to k -- in C++, the ++ and -- operators take lvalues and return lvalues, so yes, if VB.NET worked like C++ in this regard, evaluating k++ as an argument would have to pass a reference to k to a function expecting a byref argument."

    Prefix increment/decrement operators in C++ return an l-value; postfix increment/decrement operators return an r-value. Therefore, it is illegal to write i++ ++ in C/C++. An r-value cannot be bound to a non-const reference variable in C++. Continuing this reasoning, it is also illegal to call f( k++ ), if f has been declared like this:

    void f( int& );

    but is legal for the cases,

    void f( int );

    or

    void f( const int& );

    VC8 produces a warning for the above code, I don't remember the exact warning level but compile with /W4 option to see it.

  • actually i found this problem existing both in c and c++

    when i wrote

    int a=5;

    int b=++a + (++a + a--);

    the answer of b was 21

    when i wrote

    int b,a=5;

    b=++a + (++a + a--);

    the value of b came out to be 18

    and when i tried this same problem in gcc compiler

    the answer was 20

    can anyone explain why the answers are varying?

    please its important

    The ++ operator both returns a value and causes a side effect. The C# specification carefully defines the order in which expressions that have side effects are processed, so that every implementation of C# gives you the same answer.

    However, the C and C++ specifications explicitly do not specify the order in which side effects occur. A conforming C compiler is allowed to run the side effects in your expression in any order it chooses. And therefore any two compilers can disagree on the meaning of this expression. You should therefore avoid such expressions, because you cannot know that the compiler will actually do what you want. 

    If you are interested in this topic, you should look up "sequence point" on wikipedia; that will get you started on understanding exactly what is specified in C and what is left up to the compiler. -- Eric
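As a sketch of the advice in that reply, the expression from the question can be rewritten so that each side effect sits in its own statement. The version below assumes a strict left-to-right evaluation of the operands, which is the order C# prescribes; a C or C++ compiler is not obliged to match it for the original one-line expression. The function name is illustrative.

```cpp
// Evaluates ++a + (++a + a--) with every step made explicit, left to right.
int left_to_right() {
    int a = 5;
    int first  = ++a;   // a becomes 6, expression value is 6
    int second = ++a;   // a becomes 7, expression value is 7
    int third  = a--;   // expression value is 7, a becomes 6
    return first + (second + third);   // 6 + (7 + 7)
}
```

Written this way the result is 20 on every conforming compiler, because nothing is left to the implementation's choice of ordering.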
