Spot the defect: Bad comparisons, part two

Spot the defect: Bad comparisons, part two

Rate This
  • Comments 21

Suppose I want to sort a bunch of strings into order first by length, and then, once they are sorted by length, sort each group that is the same length by some other comparison. We can easily build such a device with higher-order programming:

static Comparison<string> FirstByLength(Comparison<string> thenBy)
{
  return (string x, string y) =>
  {
    // Null strings are sorted before zero-length strings; remember, we need to provide a total ordering.
    if (x == null && y == null)
      return 0;
    if (x == null)
      return -1;
    if (y == null)
      return 1;
    if (x.Length > y.Length)
      return 1;
    if (x.Length < y.Length)
      return -1;
    // They are the same length; sort on some other criterion.
    return thenBy(x, y);
  };
}

Super. This idea of composing new comparison functions out of old ones is pretty neat. We can even built a reversal device:

static Comparison<string> Reverse(Comparison<string> comparison)
{
  return (string x, string y) => -comparison(x, y);
}

Something is subtly wrong in at least one of these comparison functions. Where's the defect?

.

.

.

.

.

.

.

.

Let's restate that contract again. The comparison function returns a negative integer if the first argument is smaller than the second, a positive integer if the first is greater than the second, and zero if they are equal. Any negative integer will do, and in particular, Int32.MinValue is a negative integer. Suppose we have a bizarre comparison function that returns Int32.MinValue instead of -1:

Comparison<string> bizarre = whatever;

and we compose it:

Comparison<string> reverseFirstByLength = Reverse(FirstByLength(bizarre));

Suppose two strings are equal in length and bizarre returns Int32.MinValue for those strings. Reverse should return a positive number, but -Int32.MinValue either throws an exception (in a checked context) or returns Int32.MinValue right back (in an unchecked context). Remember, there are more negative numbers that fit into an integer than positive numbers, by one.

The right implementation of Reverse is either to spell it out:

static Comparison<string> Reverse(Comparison<string> comparison)
{
  return (string x, string y) =>
  {
    int result = comparison(x, y);
    if (result > 0) return -1;
    if (result < 0) return 1;
    return 0;
  };
}

Or, as some commenters noted, to swap left and right:

static Comparison<string> Reverse(Comparison<string> comparison)
{
  return (string x, string y) => comparison(y, x);
}

Next time: One more defect to spot.

  • Just in case anyone was thinking this kind of mistake would never make its way into production code, it has. The three implementations of LINQ to Objects which I suspect are most widely used (the one in .NET, Mono's implementation, and LinqBridge) *all* have bugs around OrderByDescending due to this. Joy.

  • @Jon, haha, wow. Yeah, I reproduced it. For those playing at home, use the overload that accepts an IComparer<T> implementation. Have it return int.MinValue where you would normally return -1, and experience the joy.

  • Do any of the VS project templates enable checked code in the Debug build? (C# Build->Advanced->"Check for arithmetic overflow/underflow")  I feel that they should to help catch such errors.  Hmm does VB enable 'checked' by default in all builds? (Compile->Advanced->"Remove integer overflow checks")

    I deal with bit parsing etc a lot so have learned to add checked and unchecked blocks/operators to my code but I guess they don't occur to most people.

  • I don't know if it's relevant, but it's the only documentation I found on MSDN regarding constraints on comparison functions:

    """Notes to Implementers

    Comparing null with any reference type is allowed and does not generate an exception. A null reference is considered to be less than any reference that is not null.

    """

    (Regarding Comparer<T>.Compare).

    The Reverse method doesn't preserve this property (although I can't imagine where it would matter).

  • "The Reverse method doesn't preserve this property (although I can't imagine where it would matter)."

    I'm not sure what you mean. It is true that after reversing a comparison that meets that criteria, the new comparison no longer does. But that seems "by design" to me. After all, if null objects remained at the beginning of the sort order after reversing, it seems to me you haven't _truly_ reversed the original sort order completely.

    I can see good arguments both ways, but at the end of the day it seems to me that the most common usage for a Reverse() composition is when you literally want the entire original sort order reversed. It won't comply strictly to the interface docs, but I think that's more because of an oversight in the docs, than because Reverse() is broken.

  • You're right. The problem is probably in this criterion, since it doesn't allow you to reverse lists that contain non null references.

Page 2 of 2 (21 items) 12