Fabulous Adventures In Coding

Eric Lippert's Blog

Quibbling Over Semicolons

In a comment to my last entry, a reader said:

In C/C++/Java/C# statements must end with a semicolon. JavaScript OTOH allows the use of a newline as a statement separator. Presumably this was done so that it would be easier to use for scripters. Personally, I hate the ambiguities it generates.

I'm with you -- I find it irritating as well.  But we're stuck with it now.

Let me clarify a bit what is actually going on here.  The statement "a newline is a legal statement separator" is not the best way to think about this language feature.  A better way to think about it is that JScript requires some statements to end in a semicolon, but JScript will automatically insert missing semicolons.  Both descriptions predict the same behavior, but I find it easier to think about semicolons as required and the compiler as automatically fixing some mistakes.

First off, which statements require semicolons?  The empty statement (yes, a single semicolon is a legal statement!), a var declaration, an expression statement, a do-while, continue, break, return and throw all require semicolons.

The automatic semicolon inserter scans the program looking for places that semicolons are required but missing. It automatically inserts the semicolon provided that:

  • the missing semicolon goes before a newline or a right-curly-brace or the end of the program.
  • adding a missing semicolon does not create an empty statement.
  • adding a missing semicolon does not screw up the header of a "for" loop; that is, the inserted semicolon would not become one of the two semicolons inside the parentheses.
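
The three conditions above can be probed with a small experiment. This is only a sketch: the helper name parses is made up for illustration, and it relies on the Function constructor throwing a SyntaxError when the source text cannot be parsed, so nothing is ever executed.

```javascript
// Hypothetical helper: true if `source` parses as a function body,
// false if the parser rejects it with a SyntaxError.
function parses(source) {
  try {
    new Function(source); // parses the text but never runs it
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// A semicolon can be inserted before a newline...
console.log(parses("var a = 1\nvar b = 2"));                   // true
// ...before a right curly brace...
console.log(parses("{ var a = 1 }"));                          // true
// ...and at the end of the program.
console.log(parses("var a = 1"));                              // true

// No newline and no brace between the statements: nothing is inserted.
console.log(parses("var a = 1 var b = 2"));                    // false
// Inserting after "if (a)" would create an empty statement, so it is not done.
console.log(parses("var a = 0, b = 0\nif (a)\nelse b = 1"));   // false
// Inserting inside the "for" header is never done either.
console.log(parses("for (var i = 0\ni < 3\ni++) {}"));         // false
```

The last three cases are rejected rather than silently repaired, which is the whole point of the restrictions.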

This leads to a few bizarre situations, because programs like these are now hard to parse:

a
++
b

is that

a
++;
b;

or

a;
++
b;

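?  To disambiguate this, JScript restricts where newlines can go.  You can't put a newline between an operand and a postfix ++ or --, between return or throw and its expression, or between break or continue and its label.  So the program above is read the second way: the ++ cannot attach to the a on the line before it, so it becomes a prefix increment of b.  Similarly, the automatic semicolon inserter turns this: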

return
a++

into

return;
a++;
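
Both restrictions are easy to observe at runtime. The function names below are invented for this sketch; the only behavior it relies on is the newline rule for return and for postfix ++ described above.

```javascript
// The newline after "return" ends the statement, so a++ is never reached.
function returnsNothing(a) {
  return
  a++
}

// With the expression on the same line, the value comes back as intended.
function returnsValue(a) {
  return a++;
}

// The a / ++ / b program from earlier, wrapped in a function.
// It is read as "a; ++b;", so only b changes.
function incrementDemo() {
  var a = 1, b = 10
  a
  ++
  b
  return [a, b]
}

console.log(returnsNothing(5)); // undefined
console.log(returnsValue(5));   // 5 (postfix ++ yields the old value)
console.log(incrementDemo());   // [ 1, 11 ]
```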

Auto semi insertion can bite you.  Consider for example:

Number.prototype.blah = function(){ /* whatever */ }
var d = 1, e = 2
var a = d * e
(d + e).blah()

Auto semicolon insertion turns that into

Number.prototype.blah = function(){ /* whatever */ };
var d = 1, e = 2;
var a = d * e                <-- no semi here!
(d + e).blah();

because of course

var a = d * e(d + e).blah();

is perfectly legal.  Nonsensical at runtime but syntactically legal.
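
Here is a sketch of how that failure shows up when the code actually runs. To avoid patching Number.prototype, it uses the built-in toFixed as a stand-in for blah; the function names are made up for illustration.

```javascript
// No semicolon after "d * e", so the next line continues the expression:
// var a = d * e(d + e).toFixed(0)
function withoutSemicolons() {
  var d = 1, e = 2
  var a = d * e
  (d + e).toFixed(0)
  return a
}

// Explicit semicolons keep the two statements apart.
function withSemicolons() {
  var d = 1, e = 2;
  var a = d * e;
  (d + e).toFixed(0);
  return a;
}

try {
  withoutSemicolons();
} catch (err) {
  // e is the number 2, and calling a number is a TypeError.
  console.log(err instanceof TypeError); // true
}

console.log(withSemicolons()); // 2
```

No semicolon is inserted in the first function because the text parses fine without one; the inserter only steps in when the next token cannot legally continue the statement.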

My advice is to use semicolons rather than relying upon the crazy rules for the auto semi inserter.

 

Published Monday, February 02, 2004 1:55 PM by Eric Lippert

Comments

 

David Cumps said:

Completely agree
don't be lazy :)
February 2, 2004 2:04 PM
 

Louis Parks said:

What was the reasoning behind allowing devs to not use semis in the first place?
February 2, 2004 2:25 PM
 

mike said:

Do JScript and JavaScript (ECMAScript, perhaps) differ in this at all?
February 2, 2004 10:50 PM
 

Eric Lippert said:

The ECMA spec mandates auto semi insertion, and to my knowledge, JScript and JavaScript both implement it consistently and according to the spec.
February 3, 2004 12:08 AM
 

Dan Shappir said:

Here is another example of an ambiguity stemming from this "feature" (and also from the overloaded meaning of the curly brackets):

{ foo: bar() }

Is this an anonymous object, with a single member foo equal to the value returned by bar()? Or maybe this is a code block with a labeled call to bar()? How should JScript (or ECMAScript) behave if it encounters a file with just this text?

I would have much preferred mandatory semi-colons to such ambiguities. Indeed, I would have been in favor of a mandatory use of var to declare variables (even members). These are two examples where trying to make something simpler ends up making it more complex.

Dan (a reader)
February 3, 2004 12:45 AM
 

Eric Lippert said:

> How should ECMAScript behave if it encounters a file with just this text?

I see what you're getting at, but actually, this has little to do with auto semi insertion. Basically you're asking should the auto semi inserter do this

{ foo : bar() }; // object literal

or this

{ foo : bar(); } // labeled statement in a block

?

But in ECMAScript the former isn't even grammatical, so the auto semi inserter will never produce it! The parser never even considers that it might be an object literal.

This is not ambiguous because it is simply not grammatical to have a statement consisting of a single expression that begins with a left curly. See section 12.4 of ECMA 262 Revision 3 for the details.
February 3, 2004 11:08 AM
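
The distinction drawn in the reply above can be checked with eval, which returns the value of the last statement it runs. This is a rough sketch: the bar function and the calls counter are invented here, and the parenthesized form is included only to show what an object literal needs in order to be read as an expression.

```javascript
var calls = 0;
function bar() {
  calls += 1;
  return 42;
}

// In statement position the braces are a block and "foo:" is a label,
// so the result is simply the value of the call to bar().
var asStatement = eval("{ foo: bar() }");

// Wrapped in parentheses the text must be an expression,
// so the braces are an object literal with one member.
var asExpression = eval("({ foo: bar() })");

console.log(typeof asStatement);  // number
console.log(typeof asExpression); // object
console.log(asExpression.foo);    // 42
```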
 


C# Nuggets said:

Javascript is a dynamic language. And that's a problem. On the one hand it allows you virtually

January 22, 2007 3:44 PM
 

Aaron Moore said:

Python seems to get along fine without them.

September 25, 2008 4:36 PM


About Eric Lippert

Eric Lippert is a senior developer on the Microsoft C# compiler team. Before that he worked on the framework of Visual Studio Tools For Office. Before that, he worked on the compilers, runtimes and tools for VBScript, JScript, Windows Script Host and other Microsoft Scripting technologies. He lives in Seattle and spends his free time editing books about programming languages, playing the piano, and trying to keep his tiny sailboat upright in Puget Sound.

© 2009 Microsoft Corporation. All rights reserved.