Comma Quibbling

Comma Quibbling

Rate This

[UPDATE: Holy goodness. Apparently this was a more popular pasttime than I anticipated. There's like a hundred solutions in there. Who knew there were that many ways to stick commas in a string? It will take me some time to go through them all, so don't be surprised if it's a couple of weeks until I get them all sorted out.]

Comma The point of Monday’s post about comma-separated lists was not so much about the actual problem; it’s a rather trivial problem. Rather, I wanted to make two points. First, stating the actual problem rather than a much harder and more general version of the problem is likely to get you a realistic solution to your actual problem much faster. And second, reworking the statement of the problem into an equivalent but structurally different statement is a great way to see solutions that you might have otherwise missed.

But whenever I make a post illustrating such points with a specific example, lots of people pipe up with their ideas for how to solve the specific example. Which is awesome; I encourage this behaviour.

So in that spirit, here’s a slightly harder version of the string concatenation problem, just for the fun of it. Write me a function that takes a non-null IEnumerable<string> and returns a string with the following characteristics:

(1) If the sequence is empty then the resulting string is "{}".
(2) If the sequence is a single item "ABC" then the resulting string is "{ABC}".
(3) If the sequence is the two item sequence "ABC", "DEF" then the resulting string is "{ABC and DEF}".
(4) If the sequence has more than two items, say, "ABC", "DEF", "G", "H" then the resulting string is "{ABC, DEF, G and H}". (Note: no Oxford comma!)

I think you get the idea. You can post your solution in the comments or use the link on the blog page to email your solution to me.

The strings in the sequence can be assumed to be non-null but can otherwise be any string value, including empty strings or strings containing commas, braces and "and".

There’s no size limit on the sequence; it could be tiny, it could be thousands of strings. But it will be finite.

All you get are the methods of IEnumerable<string>; if you want to make that thing into a list or an array, you’re going to need to do that explicitly rather than casting it and hoping for the best.

I am particularly interested in solutions which make the semantics of the code very clear to the code maintainer.

Of course, C# is most interesting to me, but if there are neat ways to express this in other languages, I’d love to see them too.

If there are any particularly amusing or interesting implementations I’ll dissect them on the blog in a future episode, probably in a week or so. I’m not going to have time to do a detailed analysis of every one.

And… go!

  • What should be returned for new[] { "", ",", "}" }?

    "{, , and }}"?

    "{"", "," and "}"}"?

    Something else?

    The former seems reasonable. Basically, I don't expect you to parse the inputs. -- Eric

  • This is what I came up with.

    private string buildCommaSeperatedList(IEnumerable<string> Items)
       {
           //Create List from IEnumerable
           List<string> ListItems = new List<string>();
           foreach (string item in Items)
           {
               ListItems.Add(item);
           }
           //Instantiate Return Value
           StringBuilder CommaSeperatedList = new StringBuilder();
           //Add Ending Bracket
           CommaSeperatedList.Append("}");
           //Get the index of the last item in the list
           int LastIndex = ListItems.Count - 1;
           //Loop through the list in reverse order.
           for (int i = LastIndex; i >= 0; i--)
           {
               //Add " and " between the last two items.
               if (i == LastIndex - 1) CommaSeperatedList.Insert(0, " and ");
               //Add commas for all items before the second to last item.
               if (i < LastIndex - 1) CommaSeperatedList.Insert(0, ", ");
               //Add List Item
               CommaSeperatedList.Insert(0, ListItems[i]);
           }

           //Add beginning bracket
           CommaSeperatedList.Insert(0, "{");
           //Return Comma Seperated List as a string
           return CommaSeperatedList.ToString();
       }

    Care to take a stab at what the asymptotic order of this algorithm is? -- Eric

  • Great question. Hopefully this will format reasonably... if not I'll mail it to Eric. The restatement of the problem is in the XML comments for the method.

    using System;
    using System.Collections.Generic;
    using System.Text;
    class CommaTeaser
    {
       static void Main()
       {
           Test();
           Test("ABC");
           Test("ABC", "DEF");
           Test("ABC", "DEF", "G", "H");
       }
       static void Test(params string[] words)
       {
           Console.WriteLine(JoinWords(words));
       }
       /// <summary>
       /// Joins words as per Eric's post.
       /// </summary>
       /// <remarks>
       /// Restating the problem:
       /// 1) We always start with "{" and end with "}"
       /// 2) The last word has no suffix
       /// 3) The penultimate word has a suffix of " and "
       /// 4) All other words have a suffix of ", "
       /// Now to work out the last and penultimate words, we just
       /// have to keep a "buffer" of the last two words we've
       /// seen.
       /// </remarks>

       static string JoinWords(IEnumerable<string> words)
       {
           StringBuilder builder = new StringBuilder("{");
           string last = null;
           string penultimate = null;
           foreach (string word in words)
           {
               // Shuffle existing words down
               if (penultimate != null)
               {
                   builder.Append(penultimate);
                   builder.Append(", ");
               }
               penultimate = last;
               last = word;
           }
           if (penultimate != null)
           {
               builder.Append(penultimate);
               builder.Append(" and ");
           }
           if (last != null)
           {
               builder.Append(last);
           }
           builder.Append("}");
           return builder.ToString();
       }
    }

  • This is a little less dumb.

    private string buildCommaSeperatedList(IEnumerable<string> Items)
       {
           //Create List from IEnumerable
           List<string> ListItems = new List<string>();
           foreach (string item in Items)
           {
               ListItems.Add(item);
           }
           //Instantiate Return Value
           StringBuilder CommaSeperatedList = new StringBuilder();
           CommaSeperatedList.Append("{");//Add Ending Bracket
           //Loop through the list.
           for (int i = 0; i <= ListItems.Count - 1; i++)
           {
               //Add List Item
               CommaSeperatedList.Append(ListItems[i]);
               //Add " and " between the last two items.
               if (i == ListItems.Count - 2) CommaSeperatedList.Append(" and ");
               //Add commas for all items before the second to last item.
               if (i < ListItems.Count - 2) CommaSeperatedList.Append(", ");
           }
           CommaSeperatedList.Append("}");//Add ending bracket
           //Return Comma Seperated List as a string
           return CommaSeperatedList.ToString();
       }

  • Here's mine that does a single pass without foreach.  It's a little like Jon's, rolled up

           public string CommaList(IEnumerable<string> e)
           {
               using (var en = e.GetEnumerator())
               {
                   // Handle 0 or 1 words
                   var word = en.MoveNext() ? en.Current : "";
                   if (!en.MoveNext())
                       return "{" + word + "}";
                   // Handle 2 or more words
                   var sb = new StringBuilder(word);
                   do
                   {
                       word = en.Current;
                       if (!en.MoveNext()) // This was the last word
                           return "{" + sb.ToString() + " and " + word + "}";
                       sb.Append(", ");
                       sb.Append(word);
                   } while (true);
               }
           }

  • err, I mean, it's a little like Jon's rolled up, but my two buckets are "word" and "en.Current"

  • @Jon Skeet - I really like the way your "buffer" works.  I wish I would have thought of that.  I wouldn't have needed to convert the IEnumerable<string> to a list<string>

  • You can use the StringBuilder trick you described in the last post. Build the list as before, track the last comma and replace it with " and"

    Something along these lines (haven't tried to compile):

    string Join(IEnumerable<string> data) {
     StringBuilder sb = new StringBuilder("{");
     bool isFirst = true;
     int off;
     foreach(string s in data) {
       off = sb.Length;
       if(!isFirst) {
         sb.Append(", ");
       } else {
         isFirst = false;
       }
       sb.Append(s);
     }
     if(off > 0) {
        sb.Remove(off, 1);
        sb.Insert(" and");
     }
     sb.Append("}");
    }

  • public static string CommaQuibbling(this IEnumerable<string> items) {
       using (IEnumerator<string> enumtor = items.GetEnumerator())
           return (new StringBuilder()).AppendBracketed(enumtor).ToString();
    }

    static StringBuilder AppendBracketed(this StringBuilder builder, Enumerator<string> enumtor) {
       return builder.Append('{').AppendNoneOrMore(enumtor).Append('}');
    }

    static StringBuilder AppendNoneOrMore(this StringBuilder builder, IEnumerator<string> enumtor) {
       return enumtor.MoveNext()
           ? builder.AppendOneOrMore(enumtor.Current, enumtor)
           : builder;
    }

    static StringBuilder AppendOneOrMore(this StringBuilder builder, string first, IEnumerator<string> enumtor) {
       return enumtor.MoveNext()
           ? builder.AppendTwoOrMore(first, enumtor.Current, enumtor)
           : builder.Append(first);
    }

    static StringBuilder AppendTwoOrMore(this StringBuilder builder, string first, string second, IEnumerator<string> enumtor) {
       return enumtor.MoveNext()
           ? builder.Append(first).Append(", ").AppendTwoOrMore(second, enumtor.Current, enumtor)
           : builder.Append(first).Append(" and ").Append(second);
    }

    Slick. But at what size list does this cause your program to crash with an out-of-stack exception? -- Eric

     

  • Well, I was going to post my solution.  But then I saw that Jon Skeet had already done so.

    For the record, I prefer that approach, because it's doable without getting into the enumerator itself.  I find that more elegant than explicitly accessing the enumerator methods.

  • This would be my version of a function that fulfills the requirement efficiently while being pretty clear in how it works

       class Program
       {
           static void Main(string[] args)
           {
               Console.WriteLine(BuildStringWithEnglishListSyntax(new string[] { }));
               Console.WriteLine(BuildStringWithEnglishListSyntax(new string[] { "ABC" }));
               Console.WriteLine(BuildStringWithEnglishListSyntax(new string[] { "ABC", "DEF" }));
               Console.WriteLine(BuildStringWithEnglishListSyntax(new string[] { "ABC", "DEF", "G", "H" }));
               Console.WriteLine(BuildStringWithEnglishListSyntax(new string[] { "", ",", "}" }));
               Console.ReadKey(true);
           }
           const string EnglishListPrefix = "{";
           const string EnglishListSuffix = "}";
           const string IntermediateSeparator = ", ";
           const string LastSeparator = " and ";
           static string BuildStringWithEnglishListSyntax(IEnumerable<string> itemsEnumerable)
           {
               StringBuilder result = new StringBuilder(EnglishListPrefix);
               using(IEnumerator<string> items = itemsEnumerable.GetEnumerator())
               {
                   if(items.MoveNext())  // make sure it's not an empty list
                   {
                       bool isFirstString = true;
                       bool isLastString = false;
                       while(!isLastString)
                       {
                           string current = items.Current;
                           isLastString = !items.MoveNext();
                           if(!isFirstString)
                           {
                               result.Append(isLastString ? LastSeparator : IntermediateSeparator);
                           }
                           else
                           {
                               isFirstString = false;
                           }
                           result.Append(current);
                       }
                   }
               }
               result.Append(EnglishListSuffix);
               return result.ToString();
           }
       }

  • > I am particularly interested in solutions which make the semantics of the code very clear to the code maintainer.

    Readability is terseness.

    static string JoinStrings(IEnumerable<string> strings)

    {

       var list = strings.ToList();

       if (list.Count == 0)

           return "{}";

       if (list.Count == 1)

           return "{" + list.First() + "}";

       return "{" + string.Join(", ", list.GetRange(0, list.Count - 1).ToArray()) + " and " + list.Last() + "}";

    }

  • I didn't include comments, but this works with lazy evaluation of the enumerable.

           public static string FormatString(IEnumerable<string> stringsToFormat)

           {

               const string Separator = ", ";

               StringBuilder builder = new StringBuilder();

               builder.Append('{');

               string lastItem = string.Empty;

               foreach (string item in stringsToFormat)

               {

                   builder.Append(item + Separator);

                   lastItem = item;

               }

               int lengthOfReplacedString = (lastItem.Length + (Separator.Length * 2));

               if (lengthOfReplacedString < builder.Length)

               {

                   builder.Replace(

                       Separator + lastItem + Separator,

                       " and " + lastItem,

                       builder.Length - lengthOfReplacedString,

                       lengthOfReplacedString);

               }

               else

               {

                   builder.Replace(Separator, string.Empty);

               }

               builder.Append('}');

               return builder.ToString();

           }

  • The first reasonable idea I came up with was to create a list of KeyValuePairs, where the Value is the separator that goes before the value.

    I note that you have redefined the meaning of an existing class rather than defining your own. I personally would use KeyValuePair only for storing pairs of keys and their related values, so that the code is clear to the reader. If you want an "item and separator pair" class, I'd either use a general-purpose tuple, or define an "ItemAndSeparatorPair" class. That said, nice solution. -- Eric

           public static string ConcatenateSequence(IEnumerable<string> stringSequence)
           {
               var strings = stringSequence.ToList().ConvertAll(input => new KeyValuePair<string, string>(input, ", "));
               if (strings.Count > 1)
               {
                   var lastItem = strings[strings.Count - 1];
                   strings[strings.Count - 1] = new KeyValuePair<string, string> lastItem.Key, " and ");
               }
               var sequenceBuilder = new StringBuilder("{");
               bool isFirst = true;
               foreach (KeyValuePair<string, string> valueSeparatorPair in strings)
               {
                   if (!isFirst)
                   {
                       sequenceBuilder.Append(valueSeparatorPair.Value);
                   }
                   sequenceBuilder.Append(valueSeparatorPair.Key);
                   isFirst = false;
               }
               sequenceBuilder.Append("}");
               return sequenceBuilder.ToString();
           }

  • I agree with Olivier that readability is terseness. That solution is clear and takes very little code. More lines equals more bugs.

Page 1 of 19 (277 items) 12345»