Implicit Line Continuation in VB 10 (Tyler Whitney)


Things are always changing.  I was at the Washington State History Museum with my daughter a couple weeks ago.  One of the exhibits features pictures of various sites that were taken many years ago.  Then it contrasts them with contemporary pictures taken of the same locations.  It was really interesting how much things changed and how quickly.

I have a couple computers in my office that I had when I was a kid.  My wife suggested that I move my little museum to a place where it would be more appreciated—which meant out of the house ;-)  So here they sit in my office.  I have an Apple ][+ that is signed by Stephen Wozniak.  He was on campus some time ago and he graciously signed it for me.  But the reason I bring it up is because every once in a while I fire it up to remember what the programming experience was like, way back when.  Having that machine in my office is a nostalgic reminder of how things change. 

And Visual Basic has certainly seen its share of change.  It’s in the context of some of my current work on the VB compiler team that I thought I’d write about a little change we are making for Dev 10.

What’s in a line

VB is a line-oriented language.  That is, we use the carriage return as our statement termination token.  You are no doubt familiar with other languages that use an explicit statement terminator--like C which uses the ‘;’.  A carriage return in VB is, for the purposes of analogy anyway, similar to the ‘;’ in C*.  But why have a statement terminator symbol in the first place?

One reason is ambiguity.  A common complaint about compilers is that if the compiler knows that the terminator is missing why not put it in for you rather than bother you with an error?  Part of the reason is that it isn’t necessarily clear where it should go.  The compiler has just reached the end of the road, as far as the current statement goes, and there may have been multiple places along the way where a terminator could have made sense.  If the compiler silently inserted it for you it may be right part of the time.  But it could also be wrong-- silently changing the meaning of your program in ways you didn’t expect.

I tried to explain why we have statement terminator tokens in a recent Channel 9 interview.  The example I used went something like this:

                On Thursday Beth coded feature1 and feature2 and feature3 on Friday.

It’s a bad sentence.  But the issue I’ll focus on is ambiguity.  When did feature2 get written?  It could have been on Thursday.  It could have been on Friday.  We can fix the ambiguity with some punctuation.

                On Thursday Beth coded feature1 and feature2.  And feature3 on Friday.

It’s still ugly.  But punctuation at least addresses the ambiguity issue.

We have the same problem in programming languages.  For instance, what does this mean:

                Return 1
                +foo()

Does it mean Return 1, or does it mean Return 1+foo()?

We can avoid the ambiguity by introducing punctuation to mark the end of each statement, e.g.:

                Return 1;
                +foo();

To terminate or not to terminate: that is the question

When you first consider the issue of allowing whitespace in a line-oriented language like VB, it seems like it would boil down to letting the scanner eat all of the whitespace and be done with it.  But the problem is more complex.  One way to think about the issue is to put the same problem in a different context.

VB uses the carriage return as a statement terminator.  C# uses the semi-colon.  Attempting to make VB read through carriage returns as if they were expendable whitespace is similar to getting C# to read through semi-colons as if they didn’t always mean we are at the end of a statement.  Parsing through carriage returns in VB for this:

                Dim x As Integer = 1 +
                2

is roughly like trying to parse this in C#:

                int x = 1 +;
                2;

The problem is being able to tell when a statement completion token means that we are at the end of the statement vs. when it doesn’t.  We have to approach an existing grammar and decide how to provide this flexibility without creating a lot of risk for all the existing code out there that will be compiled by the new compiler.

We decided to mitigate risk and keep the feature simple by limiting implicit line continuation to easy-to-understand cases.  We chose tokens where it would be easy to infer that an implicit line continuation could occur.  For instance, it is clear that x = 1 + isn’t a ‘finished’ statement.  So when we parse the ‘+’ we will peek through the statement terminator (the carriage return) to see if we can continue the expression.
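To make the peek concrete, here is a minimal sketch in Python.  It is an illustration only, not how the VB compiler is actually written.  The names `CONTINUATION_TOKENS` and `join_logical_lines` are hypothetical, and the token set is a small assumed sample rather than the real list.  The sketch treats a line break as whitespace only when the last character on the physical line is a token that clearly cannot end a statement, and it ignores string literals and comments entirely.

```python
# Assumed sample of tokens after which a line break is treated as whitespace.
# The real list for VB 10 is defined by the language, not by this sketch.
CONTINUATION_TOKENS = {"+", "-", "*", "/", "&", "=", ",", "(", "{"}


def ends_with_continuation_token(line):
    """True when the last non-blank character cannot end a statement."""
    stripped = line.rstrip()
    return bool(stripped) and stripped[-1] in CONTINUATION_TOKENS


def join_logical_lines(physical_lines):
    """Group physical lines into logical statements."""
    statements = []
    pending = ""
    for line in physical_lines:
        text = line.strip()
        if not text:
            continue  # simplification: blank lines are skipped
        pending = f"{pending} {text}" if pending else text
        if not ends_with_continuation_token(text):
            # The line break really is a statement terminator.
            statements.append(pending)
            pending = ""
    if pending:
        statements.append(pending)  # unfinished statement at end of input
    return statements


print(join_logical_lines(["Dim x As Integer = 1 +", "2"]))
# ['Dim x As Integer = 1 + 2']
print(join_logical_lines(["Return 1", "+foo()"]))
# ['Return 1', '+foo()']
```

Note that the decision is made from the end of the first line only.  The second example stays as two statements because nothing about `Return 1` suggests that more is coming.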

We don’t capture every scenario.  Given our cost and time constraints around the feature, we tried to capture the most common cases that would provide the most bang for the buck.  We also avoided the ones that just led to problems.  Here are some examples of problems you could have if we had decided to allow implicit continuation anywhere.  I take these from some analysis that Lucian Wischik (also on the VB compiler team) did on our grammar:

                With y
                    A=x
                    .xfield
                End With

If we allowed implicit continuation before the ‘.’ we would have problems knowing what the period belongs to.  For example, it could be interpreted as:

                With y
                    A=x.xfield
                End With

Or

                With y
                    A=x
                    .xfield
                End With

If we allowed whitespace after every keyword, we could run into problems where a set of tokens could be interpreted in different ways:

 

                Do
                    While condition
                    End While
                Loop

Could also be interpreted as:

 

                Do While condition
                End While  ' <- this would become a syntax error
                Loop

Here’s some more fun with keywords and whitespace.  Given the following:

 

                Do
                Loop
                Until
                Foo

Should it be interpreted as:

                Do
                Loop Until Foo

Or instead as:

 

                Do
                Loop
                Until  ' <- a method call
                Foo    ' <- another method call

And finally, for an (even more) contrived example:

 

                Sub Main()
                End
                Sub

Is this an End statement inside Sub Main(), followed by a new Sub that the user has just started typing?  Or is it End Sub?  Remember that we don’t just get to parse the finished text.  We have to parse as the user is entering the text in the first place so we can offer appropriate IntelliSense.

 

And so it goes for other examples. 

 

There is still a place for the explicit line continuation character.  You may have occasion to use it when you want to split a line in a way that implicit line continuation doesn’t accommodate.  For instance:

 

                'This works:
                Dim list As New List(Of Integer) From
                    {1, 2, 3, 4, 5}

But splitting the line before the From keyword doesn’t work implicitly, because we don’t allow a continuation before From in this context.  We need to use an explicit line continuation here:

 

                Dim list As New List(Of Integer) _
                    From {1, 2, 3, 4, 5}

So sweet was ne'er so fatal – Othello Act V, Sc. II

I was asked a question on the Channel 9 interview about how this feature is tested since it seems like it could be one of these hair-pulling things to make sure we haven’t broken anything.

One thing that helps is that we have a good test bed.  For language-specific tests alone, we have about 25,000 tests covering 1.4 million scenarios.  Our testers created a tool that can inject carriage returns into some of our existing tests after the tokens we know can imply line continuation.  Then the test is run to make sure it compiles and runs the same way it did before.  There is also the testing that is done by a tester armed with the spec and the grammar, who tries to find ways to break the compiler and the IntelliSense experience.  Tests are also hand-crafted to test various line-continuation scenarios.
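The injection idea can be sketched in a few lines of Python.  This is not the testers' actual tool, whose details aren't described here.  The function names and the token set are hypothetical, tokens are assumed to be separated by spaces, and a simple re-joining function stands in for "compile and run the test again".

```python
# Assumed sample of tokens that permit an implicit line continuation.
CONTINUATION_TOKENS = {"+", "-", "*", "/", "&", "=", ","}


def inject_line_breaks(statement):
    """Return a variant of a one-line statement with a line break inserted
    after every continuation token (tokens must be space-separated)."""
    pieces = []
    for token in statement.split():
        pieces.append(token)
        pieces.append("\n" if token in CONTINUATION_TOKENS else " ")
    return "".join(pieces).rstrip()


def rejoin(text):
    """Stand-in for the compiler: merge a line into the next one when it
    ends with a continuation token."""
    statements, pending = [], []
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue
        pending.extend(tokens)
        if tokens[-1] not in CONTINUATION_TOKENS:
            statements.append(" ".join(pending))
            pending = []
    if pending:
        statements.append(" ".join(pending))
    return statements


original = "Dim total = a + b"
variant = inject_line_breaks(original)
print(variant)
# Dim total =
# a +
# b
print(rejoin(variant) == [original])
# True
```

The check at the end mirrors the real test in spirit only: the variant with injected line breaks should mean exactly what the original did.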

It is gratifying to finally get implicit line continuation into the language.  There has been a desire to do it for some time, but it usually had to take a place in line behind other priorities.  But now it will see the light of day.

It’s fascinating how things have changed over the years.  Hopefully for the better.  When the Visual Studio 10 beta becomes available I hope you’ll give implicit line continuation a shot and let us know how it goes.

-Tyler

Comments
  • I just posted a new Channel 9 interview on a nifty little feature which isn't so little when you look

  • In this interview, Tyler Whitney , a developer on the Visual Basic compiler team, demonstrates how line

  • Add in a reformat VB to new style would be great:

    - Remove line continuations

    - Make each line of code a single line in the file (e.g., to fix old coding style of putting each argument to a function on a separate line)

    - Trim trailing spaces (don't trim the trailing line at end of the file - it's needed for processing source code with text filtering type tools)

  • It's a minor thing, but it's +really+ nice to see this is finally coming to light in VB.

    Looking forward to v10

  • Congratulations. I'm thrilled that you guys were able to get this stuff in. It should be pretty cool. It sounds like you and Lucian did some awesome work.

  • Greg - that's a good idea.  I'll remember that one when we start triaging ideas for Dev 11.

  • To be honest: I rather prefer implicit mechanisms to force coders to produce readable code. BTW, there is another line terminator in BASIC: the ever annoying colon.

    I've been coding BASIC for 25 years and RPG for 18 years now. While System i RPG definitely needs improvements, VB.NET coding is close to perfect with VS2008. Source code is not just a matter of typing, but also a matter of readability years later.

  • Great work! I'd like to see an addition however. I'd like to be able to place my commas on the beginning of the next line, instead of at the end of the line.

    eg

    Public sub Test(

         firstParam as integer

       , secondParam as integer

       , thirdParam as string

       )

       'code goes here

    End sub

    Will there be ongoing work on this feature, or is it considered "complete"?

    Thanks,

    Yann

  • Hi Yann,

    We are at the point where we are trying to limit the feature changes we make.  

    We probably won't go in this direction because we (generally) prefer to decide whether to eat the line break based on what the last token on the line is.  In this case there isn't anything about a type name token that suggests that an implicit line continuation should be implied.  We'd also have to update all of the comma-delimited list processing sections of the parser to make it consistent.

    For now, at least, I think we consider language facing changes around line continuation complete for this version.

    Best regards,

    Tyler

  • Does the line continuation allow comments?

    as in

    x = foo(

               y,  'why not

               z   ' because

              )

  • Hi John,

    Unfortunately, inline comments are not allowed at this time.

    Thanks,

    Lisa

  • As a big vb fan, I'm just now jumping into vb 10.  I would love to see a brief, working, directx 9.0 example written in vb 10.  Thanks.

    Keep up the good work team !!!!

  • There is only one right way to do this.

    Use an option like STRICT, something like:

    SET OPTION NEWLINE SOFT

    SET OPTION NEWLINE HARD

    I prefer the current syntax. The suggested changes sound cumbersome.

    IMO it should have been implemented using some type of SYNTAX marker.

    >>>SOFTMODE ON

    "code goes here"&

    "and here"

    <<<SOFTMODE OFF

    Most importantly the new feature should allow you to do:

    "TEXT TEXT TEXT

    TEXT TEXT"

    Instead of having to do:

    "TEXT TEXT TEXT"& vbNewLine & _

    "TEXT TEXT"
