Never Say Never, Part One

Can you find a lambda expression that can be implicitly converted to Func<T> for any possible T?

.

.

.

.

.

.

.

.

.

.

.

Hint: The same lambda is convertible to Action as well.

.

.

.

.

.

.

.

.

.

Func<int> function = () => { throw new Exception(); };

The rule for assigning lambdas to delegates that return int is not "the body must return an int". Rather, the rules are:

* All returns in the block must return an expression convertible to int.
* The end point of the block must not be reachable.

Both those conditions are met. The first one is vacuously met; zero out of zero returns meet the condition, so that's all of them. The second one is met because the compiler can deduce that no possible code path hits that end brace. Either "new Exception()" throws, or it goes into an infinite loop, or it succeeds and its value is thrown; no matter what, there's no possibility of the function completing normally. Of course the conditions are met for any type argument, not just int.

Similarly, it's assignable to Action because the rule for Action is simply that every return in the block must not have an expression. Again, that condition is met vacuously.
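To see both rules at work, here is a minimal sketch in which the very same lambda text is assigned to several delegate types. The class name Demo and the helper Throws are illustrative names made up for this example; they are not part of any framework.

```csharp
using System;

public static class Demo
{
    // Hypothetical helper: runs the delegate and reports whether it threw.
    public static bool Throws(Action action)
    {
        try { action(); return false; }
        catch (Exception) { return true; }
    }

    public static void Main()
    {
        // The identical lambda text converts to each of these delegate types,
        // because it contains no return statements and its end point is unreachable.
        Func<int> f1 = () => { throw new Exception(); };
        Func<string> f2 = () => { throw new Exception(); };
        Func<bool?> f3 = () => { throw new Exception(); };
        Action a = () => { throw new Exception(); };

        Console.WriteLine(Throws(() => f1())); // prints True
        Console.WriteLine(Throws(() => f2())); // prints True
        Console.WriteLine(Throws(() => f3())); // prints True
        Console.WriteLine(Throws(a));          // prints True
    }
}
```

None of the four assignments needs a cast; the conversion succeeds purely on the two rules above.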

The rule for lambdas is just a special case of the rule for regular functions. This is perfectly legal, for precisely the same reasons:

int I.M()
{
  throw new NotImplementedException();
}

Why then is this application of the "extract method" refactoring not legal?

private static void AlwaysThrows()
{
  throw new NotImplementedException();
}
int I.M()
{
  AlwaysThrows();
}

The problem here is that the C# compiler does not perform interprocedural control flow analysis. We do analysis of one method body at a time, and we trust the declared return type of the method to be accurate. The declared return type of AlwaysThrows is void, and void means that it returns no value, but it does possibly return. Therefore, the end point of the call to AlwaysThrows is reachable, and therefore the end point of M is reachable without returning an integer. You and I both know that it is not reachable, but the compiler is not sophisticated enough to know that.

Of course, this is a silly example, but it doesn't take much to turn this into a realistic example. You see this sort of thing in unit testing frameworks all the time:

Frog frog;
try
{
  frog = Animals.MakeFrog();
}
catch(Exception ex)
{
  LogAndThrowTestFailure(ex); // always throws
}
frog.Ribbit();

The compiler complains that frog.Ribbit() is illegal because MakeFrog might have thrown before frog was assigned, and LogAndThrowTestFailure -- which we know always throws but the compiler doesn't know that -- might have returned normally, in which case frog is not definitely assigned at the point of the call. If instead it had been

catch(Exception ex)
{
  throw LogTestFailureAndReturnAnotherException(ex);
}

then the compiler would correctly reason that the call to Ribbit is only reachable if the assignment succeeded.
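Put together, the corrected version might look like the sketch below. Frog, Animals and LogTestFailureAndReturnAnotherException are the names used above, but their bodies here are stand-ins invented for illustration, as is the wrapper class FrogTest.

```csharp
using System;

public class Frog
{
    public string Ribbit() { return "Ribbit!"; }
}

public static class Animals
{
    // Stand-in factory; a real one might throw.
    public static Frog MakeFrog() { return new Frog(); }
}

public static class FrogTest
{
    // Logs the failure and returns (rather than throws) a new exception.
    static Exception LogTestFailureAndReturnAnotherException(Exception ex)
    {
        Console.Error.WriteLine("Test failed: " + ex.Message);
        return new InvalidOperationException("Test failure", ex);
    }

    public static string Run()
    {
        Frog frog;
        try
        {
            frog = Animals.MakeFrog();
        }
        catch (Exception ex)
        {
            // The throw statement makes the end point of the catch block
            // unreachable, so control cannot leave it with frog unassigned.
            throw LogTestFailureAndReturnAnotherException(ex);
        }
        return frog.Ribbit(); // frog is definitely assigned here
    }

    public static void Main()
    {
        Console.WriteLine(Run()); // prints Ribbit!
    }
}
```

Replacing the throw statement with a plain call to a void method brings back the "use of unassigned local variable" error on the call to Ribbit.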

What, if anything, can we do about this?

In practice, nothing. You've got to write something like

int I.M()
{
  AlwaysThrows();
  return 0;
}

to shut the compiler up, or make AlwaysThrows return the exception and then throw it.

What about in theory? Is there anything the language designers could have done to ease this burden?

As I mentioned before, we could do interprocedural analysis, but in practice that gets real messy real fast. Imagine a hundred mutually recursive methods that all go into an infinite loop, throw, or call another method in the group. Designing a compiler that can logically deduce reachability from a complex topology of calls is doable, but potentially a lot of work. Also, interprocedural analysis only works if you have the source code for the procedures; what if one of these methods is in an assembly, and all we have to work with is the metadata? (Moreover, as we'll see next time, even interprocedural flow analysis is insufficient to solve the problem in general.)

What we need to solve this problem without interprocedural analysis of source code is another kind of return type. The CLR supports three kinds of return types today. You can return values of value type or reference type, like int or string. You can return nothing, in which case the method is marked "void". Or you can return an alias to a variable. (C# does not support this latter feature; C# only supports "ref" on variables going in to a method call, but we could also support ref on variables coming out if we chose to. Don't hold your breath while waiting for it.) What we need is a fourth kind of return type, the "this method never returns normally" return type. Such a method would have to contain no returns whatsoever and not have a reachable end point. We could know that it does not have a reachable end point by checking whether on every possible code path it always throws, always goes into an infinite loop, or always calls another "never" method.

Some programming languages do have a "never" return type; Curl, for example. A similar function annotation has also been proposed for ECMAScript. But since doing it properly in C# requires support from the verifier in the CLR, it's unlikely that it will become a feature of mainstream CLR languages. Particularly when there are such easy workarounds for the rare circumstances in which you are calling a method that never returns. (*)

Next time: could we be more clever? Just how clever can we be?

(*) For additional thoughts on programming styles in which methods never return, see my long series of articles on Continuation Passing Style.

  • Can't you solve this by using a postcondition like Contract.Ensures? I would love to see support for Contract Programming in C#.

  • Actually, it's possible to refactor such methods into not throwing the exception but returning it to be thrown by the caller.

    private static Exception AlwaysThrows()
    {
      return new NotImplementedException();
    }

    int I.M()
    {
      throw AlwaysThrows();
    }

    That will result in a different exception call stack, but that should not be a problem in such a scenario.

  • Actually, you can even throw the exception inside AlwaysThrows so that it never returns; that preserves the call stack.

  • If you wanted to support this feature entirely within C# without CLR-verifier support, it could be done with an attribute on the method. The difference between 'returns void' and 'never returns' only matters for reachability analysis which is only done at the compiler level, no?

    Even if not, it could still be done at the compiler level with some hackery; you could have the method (with an attribute) return the exception to be thrown, and make the compiler automatically translate every call to a method with that attribute into a throw of the result of calling that method... I think...

  • What would be the point of supporting ref return types? I can't think of any use case...

  • @Ihar - I think it would be cleaner to make AlwaysThrows() generic (with signature T AlwaysThrows<T>()), and then use

    "return AlwaysThrows()" instead of "throw AlwaysThrows()".

    @Thomas - one example is the Address method on the T[] type.

  • I would disagree with your "In practice, nothing" and meaningless return. I've got three different patterns I've used in these situations to good effect.

    Pattern #1: Give the extracted method a return type, even though it always throws. E.g.,

    private static int AlwaysThrows() {
      throw new NotImplementedException();
    }

    int I.M() {
      return AlwaysThrows();
    }

    This is a decent pattern, since it doesn't require the caller to even know that the extracted method always throws. Whoever is writing I.M just writes code normally.

    Pattern #2: Sometimes I'll have a lot of methods that all throw the same exception, and I want to make a factory method for that exception. (For example, when I'm implementing an interface that's wider than I need, so there are a bunch of methods where I will simply throw a NotSupportedException with a meaningful message.) In that case, I'll typically make the inner method return an exception instance:

    // Not static: GetType() needs an instance.
    private Exception AlwaysThrows() {
      return new NotSupportedException("Not supported by " + GetType().Name);
    }

    int I.M() {
      throw AlwaysThrows();
    }

    This works well as long as whoever's writing the calling method knows that they're always supposed to prefix calls with "throw". But that kind of implicit contract can get error-prone in a hurry. For that reason, I usually only use this pattern for interface methods that I don't intend to support.

    Pattern #3: Give the extracted method a return type of Exception, but still have it always throw. That way, the call stack still comes from the extracted method, which will usually give you the best possible information about the failure; but the calling method can still declare (to the compiler) its intent that control will not proceed past the call. If the caller chooses not to throw (or forgets to), there are no ill consequences.

    // Not static: GetType() needs an instance.
    private Exception AlwaysThrows() {
      throw new NotSupportedException("Not supported by " + GetType().Name);
    }

    int I.M() {
      throw AlwaysThrows();
    }

  • In what way is verifier support required? This code compiles to verifiable MSIL:

    public ref struct NoReturn
    {
      __declspec(noreturn) static void DoesNotReturn() { throw gcnew System::Exception(); }
      static int Another() { DoesNotReturn(); }
    };

    Yes the compiler uses a trick to achieve verifiability, but so could C#.

  • The generated MSIL is available at http://codepad.org/pFCwpCPD (to avoid being a spoiler).

    Oh, and the C++/CLI code I gave is also warning-free.

  • This is clearly beside the point of the post, but the first thing I thought of when looking at the initial question was:

    () => default(T)

    Are there situations where this would not work?

  • @Shawn: where would T come from in the lambda?

  • @Ben: it looks like C++/CLI does not preserve __declspec(noreturn) in assembly metadata. So if you define the method in one assembly or module, and reference it from another via #using, this no longer works.

  • I don't see why a distinct bottom type would be necessary for this. For an expression-centric language such as F#, sure, it's handy to have one if only to express the type of a throw-expression - though F# just says it's 'a forall 'a, i.e. universally substitutable, which is good enough for type analysis. Same can be done in C#, in fact.

    The reason why you need that extra piece of information about the fact that the value is never going to be returned is solely due to reachability analysis, and that exists only because C# distinguishes statements and expressions - again, to contrast versus F#, in the latter you _cannot_ avoid returning a value from a function, because its entire body is an expression that has a value - and either the types match, in which case it's all good, or the types don't match, in which case it's an error.

    For C#, it could just as well be done by some attribute placed on a method to indicate that it never really returns (separately from its declared return type), similar to __declspec(noreturn) in VC++. And, of course, the compiler can just insert "ret" in IL as needed to achieve verifiability - there's no reason to bother CLR with it all, it's a higher-level concept and can be perfectly well expressed in terms that CLR understands already.

  • @Pavel:  Frankly, I didn't think very hard about it, because I don't know enough about generics and when an explicit cast is happening even though it's not explicitly specified with the cast syntax. (I also don't know what Eric meant by "implicitly converted", so I was guessing "implicitly cast.")

    My guess was that it wouldn't be a big leap from something like this:

    static void Main()
    {
      var function = AssignLambda<int>();
      Console.WriteLine(function());
      Console.ReadLine();
    }

    static Func<T> AssignLambda<T>()
    {
      return () => default(T);
    }

    to something that would infer T based on the context of the call, but of course that could be wrong.

    Regardless, is it effectively the same thing to say that explicitly specifying int in this situation is also explicitly converting to Func<int>?

  • Ah, I see the source of confusion.

    Eric's question isn't really about C# generics, he had simply used similar terminology. The point is to come up with such a lambda expression that someone else can put in a context where Func<whatever> is expected and have it compile. In other words, if $lambda is a sequence of tokens representing said lambda expression, then it should be valid for me to do:

     Func<int> x = $lambda;
     Func<string> y = $lambda;
     Func<bool?> z = $lambda;

    etc (note that $lambda in all of the above should be the exact same token sequence!). Now as one special case of the above, it should be legal to write:

     void Foo<T>() { Func<T> f = $lambda; }

    which is the only case where "T" as a type parameter actually comes up in the code. But it's not really special compared to all others.
