Out parameters and LINQ do not mix

Out parameters and LINQ do not mix

Rate This
  • Comments 28

I am back from my annual vacation in beautiful southwestern Ontario; before I get into the subject of today's post, check out this shot I took with my Windows Phone camera from the plane on the trip home. We are at 37000 feet, just outside of Billings, Montana, a few minutes before sunset:

WP_000062

The whole thing was chock full of immense lightning arcs which unfortunately I did not capture in the image. This is certainly the largest isolated thunderstorm I've ever seen from the outside. Notice the characteristic anvil shape; as I've described before, we've got a huge heat engine here that is extracting the latent heat from the gaseous and liquid water, and then using that heat to power the updraft that sucks more warm water vapor upwards. Quite beautiful.

Well, enough chit-chat about the weather. Today: what's wrong with this code?

var seq = new List<string> { "1", "blah", "3" };
int tmp;
var nums =
  from item in seq
  let success = int.TryParse(item, out tmp)
  select success ? tmp : 0;

The intention is pretty clear: take a sequence of strings and produce a sequence of numbers by turning the elements that can be parsed as numbers into those numbers and the rest into zero.

The C# compiler will give you a definite assignment error if you try this, which seems strange. Why does it do that? Well, think about what code the compiler will translate the last statement into:

var nums =
  seq.Select(item=>new {item, success = int.TryParse(item, out tmp)})
     .Select(transparent => transparent.success ? tmp : 0);

We have two method calls and two lambdas. Clearly the first lambda assigns tmp and the second reads it, but we have no guarantee whatsoever that the first call to Select invokes the lambda! It could, for some bizarre reason of its own, never invoke the lambda. Since the compiler cannot prove that tmp is definitely assigned before it is read, this program is an error.

So is the solution then to assign tmp in the variable declaration? Certainly not! That makes the program compile, but it is a "worst practice" to mutate a variable like this. Remember, that one variable is going to be shared by every delegate invocation! In this simple LINQ-to-Objects scenario it is the case that the delegates are invoked in the sensible order, but even a small change makes this nice property go out the window:

int tmp = 0;
var nums =
  from item in seq
  let success = int.TryParse(item, out tmp)
  orderby item
  select success ? tmp : 0;
foreach(var num in nums) { Console.WriteLine(num); }

Now what happens?

We make an object that represents the query. The query object consists of three steps: do the "let" projection, do the sort, and do the final projection. Remember, the query is not executed until the first result from it is requested; the assignment to "nums" just produces an object that represents the query, not the results.

Then we execute the query by entering the body of the loop. Doing so initiates a whole chain of events, but clearly it must be the case that the entire "let" projection is executed from start to finish over the whole sequence in order to get the resulting pairs to be sorted by the "orderby" clause. Executing the "let" projection lambda three times causes "tmp" to be mutated three times. Only after the sort is completed is the final projection executed, and it uses the current value of tmp, not the value that tmp was back in the distant past!

So what is the right thing to do here? The solution is to write your own extension method version of TryParse the way it would have been written had there been nullable value types available in the first place:

static int? MyTryParse(this string item)
{
    int tmp;
    bool success = int.TryParse(item, out tmp);
    return success ? (int?)tmp : (int?)null;
}

And now we can say:

var nums =
  from item in seq
  select item.MyTryParse() ?? 0;

The mutation of the variable is now isolated to the activation of the method, rather than a side effect that is observed by the query. Try to always avoid side effects in queries.


Thanks to Bill Wagner for the question that inspired this blog entry.

  • >> The solution is to write your own extension method version of TryParse the way it would have been written had there been nullable value types available in the first place

    But Int32.TryParse (and the whole Try... pattern in general) was written when nullable value types were already available - it was added in .NET 2.0!

  • I would prefer

    static int ParseOrDefault(this string item)

    {

      int tmp;

      if (int.TryParse(item, out tmp)) return tmp;

      return 0;

    }

    There's no way to make that generic without reflecting a TryParse method is there?

  • Or maybe we can make it generic...

       static T ParseOrDefault<T>(this string text)

       {

           TypeConverter converter = TypeDescriptor.GetConverter(typeof(T));

           if (converter != null && converter.CanConvertFrom(typeof(string)) && converter.IsValid(text))

               return (T)converter.ConvertFrom(text);

           return default(T);

       }

  • In your post you say: "the way it would have been written had there been nullable value types available in the first place". Since nullable types have been around for 7 years or so, did Microsoft thought of adding overloads of the TryParse methods taking a single string parameter and returning a nullable type? Or did it consider it was not worth the time to design/implement/test/etc. ?

  • Just wrap the TryParse into a version that returns a Tuple just like F# does and voila - Problem solved.

  • The cost of having "your own" extension methods is huge. Your code is now harder to read by other developers. It's so easy to assume an extension method is a well-tested framework method.

  • @Andomar:

    I don't agree. Nothing stops this extension from being "well-tested". Adding unit tests for such extension methods is easy because there are no external dependencies. And since it's a thin wrapper over an existing framework method, the risks of having bugs in this method are slim in the first place.

    Having some custom extension methods helps me simplify my code, which gets easier to read, not harder.

  • A while back, mostly for fun, I implemented something similar, but purely in LINQ:

    public static readonly Func<string, int?> TryParseInt = ((Func<int, string, int?>)((seed, value) => int.TryParse(value, out seed) ? seed : (int?)null)).Curry()(default(int));

    Where Curry is a series of functions with the following pattern. Only the second one is needed for this particular example.

    public static Func<T1, T2> Curry<T1, T2>(this Func<T1, T2> function)

    {

       return a => function(a);

    }

    public static Func<T1, Func<T2, T3>> Curry<T1, T2, T3>(this Func<T1, T2, T3> function)

    {

       return a => b => function(a, b);

    }

    Pretty horrible to read, but that was not my main objective at the time :)

  • var nums =

     from item in seq

     select Regex.IsMatch(item,"^-?(\d)*$") ? int.parse(item) : 0

  • "The solution is to write your own extension method version of TryParse the way it would have been written had there been nullable value types available in the first place"

    Nullable value types *did* exist when Int32.TryParse was introduced, in .NET 2.0. Any idea why they didn't go for it at that point?

  • (Doh - this is what I get for not reading comments before posting one. An argument in favour of having the comment box *under* the existing comments, of course... but that's no excuse.)

  • Wasn't Double.TryParse there pre-2.0?  I seem to remember there being one TryParse, then the others were introduced in 2.0

  • @Patrick: Good call. It's not listed on msdn.microsoft.com/.../994c0zb1.aspx ("Supported in: 4.5, 4, 3.5, 3.0, 2.0") but it *did* exist apparently:

    msdn.microsoft.com/.../system.double.tryparse(v=vs.71)

    I think I'd argue it would have been better to deprecate the old method than to propagate the nastiness though :)

  • Yes!  Double.TryParse (and, I think maybe DateTime.TryParse) did exist in .NET 1.0.  I submitted a request through the old MSDN Product Feedback (Connect pre-cursor) to add similar TryParse methods to all of the BCL value types way back in 2003 or so.  I'd hazard a guess that the new TryParse implementations were done before the final decision to add nullable types in 2.0.

  • Silly comment system!  Or silly user - the previous comment was in fact from me.

Page 1 of 2 (28 items) 12