Restating the problem

Restating the problem

Rate This
  • Comments 34

A problem statement:

I am trying to loop though a sequence of strings. How can I determine when I am on the last item in a sequence? I don’t see how to do it with “foreach”.

Indeed, “foreach” does not make it easy to know when you are almost done. Now, if I were foolish enough to actually answer the question as stated, I’d probably say something like this:

You can do so by eschewing “foreach” and rolling your own loop code that talks directly to the enumerator:

IEnumerable<string> items = GetStrings();
IEnumerator<string> enumtor = items.GetEnumerator();
if (enumtor.MoveNext())
{
  string current = enumtor.Current;
  while(true)
  {
    if (enumtor.MoveNext())
    {
      DoSomething(current); // current is not the last item in the sequence. 
      current = enumtor.Current;
    }
    else
    {
      DoSomethingElse(current); // current is the last item in the sequence.
      break;
    }
  }
}
else
{
  // handle case where sequence was empty
}

Yuck. This is horrid. A fairly deep nesting level, some duplicated code, the mechanisms of iteration overwhelm any semantic meaning in the code, and it all makes my eyes hurt.

When faced with a question like that, rather than writing this horrid code I’ll usually push back and ask “why do you want to know?”

I am trying to walk though a sequence of strings and build a string like "a,b,c,d".  After each item I want to place a comma except not after the last item. So I need to know when I’m on the last item.

Well knowing that certainly makes the problem a whole lot easier to solve, doesn’t it? A whole bunch of techniques come to mind when given the real problem to solve:

First technique: find an off-the-shelf part that does what you want.

In this case, call the ToArray extension method on the sequence and then pass the whole thing to String.Join and you’re done.

Second technique: Do more work than you need, and then undo some of it.

Using a string builder, to build up the result, put a comma after every item. Then “back up” when you’re done and remove the last comma (if there were any items in the sequence, of course).

Third technique: re-state the problem and see if that makes it any easier.

Consider this equivalent statement of the problem:

I am trying to walk though a sequence of strings and build a string like "a,b,c,d". Before each item I want to place a comma except not before the first item. So I need to know when I’m on the first item.

And suddenly the problem is much, much simpler. It’s easy to tell if you’re on the first item in a sequence by just setting a “I’m at the first item” flag outside the foreach loop and clearing it after going through the loop.

When I have a coding problem to solve and it looks like I’m about to write a bunch of really gross code to do so, it’s helpful to take a step back and ask myself “Could I state the problem in an equivalent but different way?” I’ll try to state a problem that I’ve been thinking about iteratively in a recursive form, or vice versa. Or I’ll try to find formalizations that express the problem in mathematical terms. (For example, the method type inference algorithm can be characterized as a graph theory problem where type parameters are nodes and dependency relationships are edges.) Often by doing so I find that the new statement of the problem corresponds more directly to clear code.

  • Don't forget .Aggregate((x, y) => x + "," + y);

  • Actually, do forget that - it's not any clearer than String.Join, and using operator+ for strings in a loop is frowned upon for a reason.

  • I realise that this is beside the point, Eric, but I think your loop is slightly more horrid than it needs to be. I'd write the body of the outer if-statement in a different way:

    string current = enumtor.Current;

    // 'current' may or may not be the last element

    while (enumtor.MoveNext())

    {

    // 'current' is not the last element

    DoSomething(current);

    current = enumtor.Current;

    // 'current' may or may not be the last element

    }

    // 'current' is the last element

    DoSomethingElse(current);

    Still plenty horrible, but at least it's less nested.

  • @pminaev: "using operator+ for strings in a loop is frowned upon for a reason."

    It's because every operator+ as used in the .Aggregate() call will generate a string.  This is (of course) by design.

    Unfortunately, it means that you will produce N-1 strings for each such Aggregate() call, where N is the number of elements in your list, so if you have a list with 5 elements, 4 strings will be produced, 3 of which are garbage and will never be used again.

    In short, you'll be making lots of work for the GC for minimal gain.

    A StringBuilder version, in contrast, would (hopefully) produce fewer internal temporary objects (due to amortized array reallocations within StringBuilder), to wit:

    list.Skip(1).Aggregate(

     new StringBuilder().Append(list.First()),

     (buf, e) => buf.Append(",").Append(e),

     buf => buf.ToString());

    I did some performance testing awhile back, and by far the fastest version was ToString() + string.Join().  This results in only O(log N) temporaries (due to ammortized array resizing), and the string.Join() call can efficiently pre-allocate the buffer for the resulting string without generating any temporaries.

  • "For example, the method type inference algorithm can be characterized as a graph theory problem where type parameters are nodes and dependency relationships are edges."

    I'd like to request a full blog post that illustrates the above!

    As a developer I've personally struggled to make the leap from some problem that I need to solve and relate it to some algorithm about graph or trees...

  • In Java, I do this by appending to a StringBuffer, but testing that the buffer is not empty before adding the leading comma.  Granted it doesn't work right with empty strings, but that's often not relevant.

  • I like the following which I found on the web recently:

    StringBuilder SB = new StringBuilder();

    String Comma = "";

    foreach item in container {

     SB.add(comma);

     SB.add(Item);

     Comma = ",";

    }

    result = SB.ToString();

  • String.Join, since it performs the requested work, is by far the best choice.

    I prefer an extension method for call-site neatness:

    public static string Join(this IEnumerable<string> source, string separator)

    {

     return String.Join(separator, source.ToArray());

    }

    public static string Join(this IEnumerable<string> source, string separator, int startIndex, int count)

    {

     return String.Join(separator, source.ToArray(), startIndex, count);

    }

  • I think this is one of the reasons why you often see some rather thoughtless code. Code as if the author went straight to the keyboard after project description etc.

  • I could be way off here, but isn't this really a trivial problem?  Why are you maing it so difficult?

    string newString = string.Empty;

    for (int i = 0; i < items.Count(); i++)

       newString += items[i] + (i == items.Count() - 1 ? string.Empty : ",");

  • **I ignored performance rules above, just showing a simple way to get the solution

  • @naspinski: Since this is an IEnumerable and not an array, getting the Count() is an O(n) operation. which you are doing 2n times, like accessing by index, which is also an O(n) operation. If you are using a ToArray(). you already have a library function to do that anyway.

  • Reverse your sequence, and pick the first item. You now have the item that was originally the last item. You should separately the handle the case of zero length sequence.

  • Well, you could just use foreach to add a comma after each item, then take the concatenated string and pop off the last char.

  • I've always liked (pseudo code):

    if (it.hasNext()) {

     sb.append(it.current);

     while (it.hasNext()) {

       sb.append(',');

       sb.append(it.current);

     }

    }

Page 1 of 3 (34 items) 123