Comma Quibbling

Comma Quibbling

Rate This

[UPDATE: Holy goodness. Apparently this was a more popular pasttime than I anticipated. There's like a hundred solutions in there. Who knew there were that many ways to stick commas in a string? It will take me some time to go through them all, so don't be surprised if it's a couple of weeks until I get them all sorted out.]

Comma The point of Monday’s post about comma-separated lists was not so much about the actual problem; it’s a rather trivial problem. Rather, I wanted to make two points. First, stating the actual problem rather than a much harder and more general version of the problem is likely to get you a realistic solution to your actual problem much faster. And second, reworking the statement of the problem into an equivalent but structurally different statement is a great way to see solutions that you might have otherwise missed.

But whenever I make a post illustrating such points with a specific example, lots of people pipe up with their ideas for how to solve the specific example. Which is awesome; I encourage this behaviour.

So in that spirit, here’s a slightly harder version of the string concatenation problem, just for the fun of it. Write me a function that takes a non-null IEnumerable<string> and returns a string with the following characteristics:

(1) If the sequence is empty then the resulting string is "{}".
(2) If the sequence is a single item "ABC" then the resulting string is "{ABC}".
(3) If the sequence is the two item sequence "ABC", "DEF" then the resulting string is "{ABC and DEF}".
(4) If the sequence has more than two items, say, "ABC", "DEF", "G", "H" then the resulting string is "{ABC, DEF, G and H}". (Note: no Oxford comma!)

I think you get the idea. You can post your solution in the comments or use the link on the blog page to email your solution to me.

The strings in the sequence can be assumed to be non-null but can otherwise be any string value, including empty strings or strings containing commas, braces and "and".

There’s no size limit on the sequence; it could be tiny, it could be thousands of strings. But it will be finite.

All you get are the methods of IEnumerable<string>; if you want to make that thing into a list or an array, you’re going to need to do that explicitly rather than casting it and hoping for the best.

I am particularly interested in solutions which make the semantics of the code very clear to the code maintainer.

Of course, C# is most interesting to me, but if there are neat ways to express this in other languages, I’d love to see them too.

If there are any particularly amusing or interesting implementations I’ll dissect them on the blog in a future episode, probably in a week or so. I’m not going to have time to do a detailed analysis of every one.

And… go!

  • Here are a few variations on a theme:

    string[] strings = { "ABC", "DEF", "G", "H" };
    IEnumerable<string> separators = new[] { "", " and " }
    .Concat(Enumerable.Repeat(", ", int.MaxValue));
    string result = "{" + strings
    .Reverse()
    .Select((str, index) => str + separators.ElementAt(index))
    .Aggregate((s1, s2) => s2 + s1) + "}";
    //---------
    int position = 0;
    StringBuilder result2 = strings.Reverse().Aggregate(new StringBuilder("{}"),
    (sb, str) => sb.Insert(1, position++ == 1 ? " and " : position > 1 ? ", " : "").Insert(1, str));

  • I went for clarity over efficiency.

           public static string Format(IEnumerable<String> words)
           {
               int count = words.Count();
               if (count == 0) return "{}";
               if (count == 1) return string.Format("{{{0}}}", words.Single());
               string commaDelimited = words.Take(count - 1).Aggregate((list, word) => string.Format("{0}, {1}", list, word));
               return string.Format("{{{0} and {1}}}", commaDelimited, words.Last());
           }

  • I'm not particularly familiar with all the available LINQ methods, but I imagine that if "readability" is what you're going for, a LINQ based approach would be the best. That said, if you are considering the case of many thousands of strings in enumerable, possibly being called many thousands of times, then you want a case that iterates only once all the way through while still being readable. Thus, any one of the above suggestions that meet the criteria are just as good (Jon Skeet's is my current favorite choice).

  •        static string GetCombined(IEnumerable<string> strings)
           {
               string opening = "{";
               var builder = new StringBuilder(opening);
               var enumerator = strings.GetEnumerator();
               bool hasNext = enumerator.MoveNext();
               while (hasNext)
               {
                   string s = enumerator.Current;
                   hasNext = enumerator.MoveNext();
                   if (builder.Length > opening.Length) // after the opening curly brace
                       builder.Append(hasNext ? ", " : " and ");
                   builder.Append(s);
               }
               builder.Append('}');
               return builder.ToString();
           }

  • >> Extension methods <<

           public static void IterateIndex<T>(this IEnumerable<T> items, Action<int, T> action)

           {

               IterateIndex(items, action, 0);

           }

           public static void IterateIndex<T>(this IEnumerable<T> items, Action<int, T> action, int idx)

           {

               if (items == null)

                   throw new ArgumentNullException("items");

               if (action == null)

                   throw new ArgumentNullException("action");

               IEnumerator<T> enumerator = items.GetEnumerator();

               //Adjusting values

               for (int count = 0; count < idx; count++)

                   enumerator.MoveNext();

               while (enumerator.MoveNext())

               {

                   action(idx, enumerator.Current);

                   idx++;

               }

           }

    >> Function <<

           static string GetComma(IEnumerable<string> strings)

           {

               StringBuilder sb = new StringBuilder();

               sb.Append("{");

               var count = strings.Count();

               strings.IterateIndex<string>((i, s) =>

               {

                   if (count > 1 && i == count - 1)

                   {

                       sb.Append(" AND ");

                   }

                   else if (i > 0)

                   {

                       sb.Append(",");

                       sb.Append(" ");

                   }

                   sb.Append(s);

               });

               sb.Append("}");

               return sb.ToString();

           }

    >>Execute<<

               var s1 = new List<string>();

               var s2 = new List<string>() { "One" };

               var s3 = new List<string>() { "One", "Two" };

               var s4 = new List<string>() { "One", "Two", "Three", "Four" };

               Console.WriteLine(GetComma(s1));

               Console.WriteLine(GetComma(s2));

               Console.WriteLine(GetComma(s3));

               Console.WriteLine(GetComma(s4));

  • Here's mine. It's not particularly brilliant or elegant, but at least it's short and easy to understand ;)

    The main drawback of this solution is that the strings are actually enumerated twice (in ToArray then in Join), although there's no explicit loop

           private static string FormatList(IEnumerable<string> list)

           {

               string[] tab = list.ToArray();

               int n = tab.Length;

               string tmp;

               if (n > 1)

               {

                   tmp = String.Join(", ", tab, 0, n - 1);

                   tmp += " and " + tab[n - 1];

               }

               else

               {

                   tmp = String.Join(" and ", tab);

               }

               return "{" + tmp + "}";

           }

  • Oh great, Jon Skeet posted! Of course I'm just kidding. I'm sure some of my ideas have already been posted but that's ok...

    Here's the meat and potatoes:

    public static string FormatAsString(this IEnumerable<string> Sequence, string ItemSeparator, string LastItemSeparator)
    {
       var formattedString = new StringBuilder();
       string prev = null;
       foreach (string s in Sequence)
       {
           // the trick is to save the last item in the sequence for use outside of this loop
           if (prev != null)
           {
               formattedString.Append(prev);
               formattedString.Append(ItemSeparator);
           }
           prev = s;
       }
       if (prev != null)
       {
           if (formattedString.Length > 0)
           {
               formattedString.Append(LastItemSeparator);
           }
           formattedString.Append(prev);
       }
       formattedString.Append("}");
       return "{" + formattedString.ToString();
    }

    Which can be called like this:

    string GetSequenceAsFormattedString(IEnumerable<string> Sequence)
    {
       if (Sequence == null)            
       {
           return string.Empty;
       }
       return Sequence.FormatAsString(", ", "and ");
    }

  • static string CommaSeparatedString(IEnumerable<string> input)
    {
      string first = "", last = null;
      var rest = new StringBuilder();
      IEnumerator<string> en = input.GetEnumerator();
      if (en.MoveNext())
        first = en.Current;
      if (en.MoveNext())
        last = en.Current;
      while (en.MoveNext())
      {
        rest.Append(", ").Append(last);
        last = en.Current;
      }
      if (last != null)
      rest.Append(" and ").Append(last);
      return "{" + first + rest + "}";
    }

  • I don't particularly like this solution from the perspective of the maintainer, but off the top of my head it's hard to find a much more readable solution.  I look forward to seeing Eric's suggestion for how to do this.

    [STAThread]

    static void Main() {

       Console.WriteLine(GenerateSet(new string[] { }));

       Console.WriteLine(GenerateSet(new string[] { "ABC" }));

       Console.WriteLine(GenerateSet(new string[] { "ABC", "DEF" }));

       Console.WriteLine(GenerateSet(new string[] { "ABC", "DEF", "G", "H" }));

       Console.WriteLine(GenerateSet(new string[] { "Sample and other stuff", "Hello, World!", "XYZ", "42", "", "This is a test", "Last" }));

    }

    static private string GenerateSet(IEnumerable<string> items) {

       int itemsAdded = 0;

       StringBuilder sb = new StringBuilder();

       sb.Append("{");

       string current = null;

       foreach(string item in items) {

           if(current != null) {

               if(itemsAdded != 0) {

                   sb.Append(", ");

               }

               sb.Append(current);

               ++itemsAdded;

           }

           current = item;

       }

       if(current != null) {

           if(itemsAdded != 0) {

               sb.Append(" and ");

           }

           sb.Append(current);

       }

       sb.Append("}");

       return sb.ToString();

    }

  • Here's my solution, that I tried to keep as short as possible. I only have access to .NET 2.0 here so nothing fancy. You'll notice that this is pretty similar to Olivier's solution up above. I agree entirely that readability and terseness go hand in hand. I didn't want to mess with flags or anything else that was not directly related to the problem I was trying to solve.

           static string FancyConcat(IEnumerable<string> strEnumerable) {

               List<string> strList = new List<string>(strEnumerable);

               StringBuilder sb = new StringBuilder("{");

               if (strList.Count > 0)

               {

                   sb.Append(string.Join(", ", strList.GetRange(0, strList.Count - 1).ToArray()));

                   if (strList.Count > 1)

                       sb.Append(" and ");

                   sb.Append(strList[strList.Count - 1]);

               }

               sb.Append("}");

               return sb.ToString();

           }

  •        static string Quibble(IEnumerable<string> strings)

           {

               string toReturn = "{";

               int count = strings.Count(str => str != null);

               for (int i = 0; i < count; i++)

               {

                   if (i == count-1)

                   {

                       toReturn += strings.ElementAt(i);

                   }

                   else

                   {

                       string delim = ", ";

                       if (i == (count-2))

                       {

                           delim = " and ";

                       }

                       toReturn += (strings.ElementAt(i) + delim);

                   }

               }

               toReturn += "}";

               return toReturn;

           }

  • Here's my .02

    static void Main(string[] args)

    {

       Console.WriteLine(list(null));

       Console.WriteLine(list(new[] { "ABC" }));

       Console.WriteLine(list(new[] { "ABC", "DEF" }));

       Console.WriteLine(list(new[] { "ABC", "DEF", "G", "H" }));

       Console.WriteLine("Press any key");

       Console.ReadLine();

    }

    public static string list(IEnumerable<string> words)

    {

       string escape = "\n";

       string delim = ", ";

       string finalDelim = " and ";

       if (words == null)

           return "{}";

       string sentence = "{" + string.Join(escape, words.ToArray()) + "}";

       int lastCommaPosition = sentence.LastIndexOf(escape);

       if (lastCommaPosition > -1 && (sentence.IndexOf(escape) < lastCommaPosition))

       {

           string remainder = sentence.Substring(lastCommaPosition + 1, sentence.Length - (lastCommaPosition + 1));

           sentence = sentence.Replace(sentence.Substring(lastCommaPosition, sentence.Length - lastCommaPosition), finalDelim + remainder);

       }

       sentence = sentence.Replace(escape, delim);

       return sentence;

    }

  • Here's my solution:

    module StringCombiner =

     let CombineStrings (i: 'a list) =

       // F# gives us easy ways to look

       // at the head of a list plus everything

       // else.  In this case, we want the

       // last item plus all the items before

       // it, so just reverse the list

       match List.rev i with

       | [] -> "{}"  // No items, then just return curly braces

       | h :: [] -> sprintf "{%s}" (h.ToString())  // One item, return the item in curly braces

       | lastItem :: firstItemsReversed ->

         let firstItemsAsStrings = List.map (fun f -> f.ToString()) (List.rev firstItemsReversed)

         let firstItemsJoinedWithCommas = String.concat ", " firstItemsAsStrings

         sprintf "{%s and %s}" firstItemsJoinedWithCommas (lastItem.ToString())

     let CombineStringsGivenIterator (i: 'a seq) =

       CombineStrings (Seq.to_list i)

    It's a pretty standard list comprehension problem.

    From C#, this looks like (according to Reflector):

    [CompilationMapping(SourceConstructFlags.Module)]

    public static class StringCombiner

    {

       // Methods

       static StringCombiner();

       public static string CombineStrings<A>(List<A> i);

       public static string CombineStringsFromIterator<A>(IEnumerable<A> i);

       // Nested Types

       [Serializable]

       internal class clo@399<A> : FastFunc<A, string>

       {

           // Methods

           public clo@399();

           public override string Invoke(A f);

       }

    }

  • I was going post my solution, but I realized mdefalco pretty much did it already.  My only correction is that I would do a var arr = words.ToArray() and work off arr instead of words.  The reason is that many IQueryable<string> come from LINQ2SQL data sources and can only be enumerated once, which has bit me in the rear too many times.  As such, the ToArray would allow exactly one enumeration over the IEnumerable.  ToList() would also do the job.

  • public static IEnumerable<string> GetStrings()
           {
               yield return "ABC";
               yield return "CDE";
               yield return "G";
               yield return "H";
           }

    Here is the code to generate the output...

    var x = GetStrings().Reverse().Skip(1).Reverse().ToArray();
    var z = String.Format("{0} {1} and {2} {3}", "{", String.Join(", ", x), 
                                                             GetStrings().Last(), "}");
    Console.WriteLine(z);

    What if the sequence only has one item? -- Eric

     

Page 2 of 19 (277 items) 12345»