Comma Quibbling

[UPDATE: Holy goodness. Apparently this was a more popular pastime than I anticipated. There's like a hundred solutions in there. Who knew there were that many ways to stick commas in a string? It will take me some time to go through them all, so don't be surprised if it's a couple of weeks until I get them all sorted out.]

The point of Monday’s post about comma-separated lists was not so much about the actual problem; it’s a rather trivial problem. Rather, I wanted to make two points. First, stating the actual problem rather than a much harder and more general version of the problem is likely to get you a realistic solution to your actual problem much faster. And second, reworking the statement of the problem into an equivalent but structurally different statement is a great way to see solutions that you might have otherwise missed.

But whenever I make a post illustrating such points with a specific example, lots of people pipe up with their ideas for how to solve the specific example. Which is awesome; I encourage this behaviour.

So in that spirit, here’s a slightly harder version of the string concatenation problem, just for the fun of it. Write me a function that takes a non-null IEnumerable<string> and returns a string with the following characteristics:

(1) If the sequence is empty then the resulting string is "{}".
(2) If the sequence is a single item "ABC" then the resulting string is "{ABC}".
(3) If the sequence is the two item sequence "ABC", "DEF" then the resulting string is "{ABC and DEF}".
(4) If the sequence has more than two items, say, "ABC", "DEF", "G", "H" then the resulting string is "{ABC, DEF, G and H}". (Note: no Oxford comma!)

I think you get the idea. You can post your solution in the comments or use the link on the blog page to email your solution to me.

The strings in the sequence can be assumed to be non-null but can otherwise be any string value, including empty strings or strings containing commas, braces and "and".
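
To make the rules concrete, here is a rough sketch of how they could be written down as a table of inputs and expected outputs. This is only an illustration, not a solution: the names `QuibbleSpec`, `Cases`, `AsSequence` and `Failures` are invented for this example, the last three rows are extra cases derived from the rules above, and the code assumes a compiler with tuple support (C# 7 or later).

```csharp
using System;
using System.Collections.Generic;

static class QuibbleSpec
{
    // Inputs paired with the outputs the four rules require. The last three
    // rows are extra cases derived from the rules, not quoted from the post.
    public static readonly (string[] Input, string Expected)[] Cases =
    {
        (new string[0],                    "{}"),
        (new[] { "ABC" },                  "{ABC}"),
        (new[] { "ABC", "DEF" },           "{ABC and DEF}"),
        (new[] { "ABC", "DEF", "G", "H" }, "{ABC, DEF, G and H}"),
        (new[] { "" },                     "{}"),
        (new[] { "A, B", "C" },            "{A, B and C}"),
        (new[] { "x", "y", "p and q" },    "{x, y and p and q}"),
    };

    // Wraps the array in an iterator so that a candidate sees nothing
    // but the methods of IEnumerable<string>.
    static IEnumerable<string> AsSequence(string[] items)
    {
        foreach (string item in items)
            yield return item;
    }

    // Returns a description of every case the candidate gets wrong.
    public static List<string> Failures(Func<IEnumerable<string>, string> candidate)
    {
        var failures = new List<string>();
        foreach (var (input, expected) in Cases)
        {
            string actual = candidate(AsSequence(input));
            if (actual != expected)
                failures.Add($"expected \"{expected}\" but got \"{actual}\"");
        }
        return failures;
    }

    public static void Main()
    {
        // A deliberately naive candidate that ignores the "and" rule.
        Func<IEnumerable<string>, string> naive =
            words => "{" + string.Join(", ", words) + "}";

        foreach (string failure in Failures(naive))
            Console.WriteLine(failure);
    }
}
```

Any candidate with the signature `Func<IEnumerable<string>, string>` can be passed to `Failures`; an empty result means it agrees with every row in the table.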

There’s no size limit on the sequence; it could be tiny, it could be thousands of strings. But it will be finite.

All you get are the methods of IEnumerable<string>; if you want to make that thing into a list or an array, you’re going to need to do that explicitly rather than casting it and hoping for the best.
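
A minimal sketch of why casting and hoping is unreliable, assuming a sequence produced by an iterator block (the names `CastVersusCopy` and `LazyWords` are made up for this example):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CastVersusCopy
{
    // Hypothetical source: an iterator block, so the object behind the
    // IEnumerable<string> is compiler-generated and is not a list or an array.
    public static IEnumerable<string> LazyWords()
    {
        yield return "ABC";
        yield return "DEF";
    }

    public static void Main()
    {
        IEnumerable<string> words = LazyWords();

        // Casting and hoping: "as" yields null here because the sequence
        // does not implement IList<string>.
        IList<string> hopeful = words as IList<string>;
        Console.WriteLine(hopeful == null);   // True

        // Copying explicitly works for any finite sequence.
        List<string> copy = words.ToList();
        Console.WriteLine(copy.Count);        // 2
    }
}
```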

I am particularly interested in solutions which make the semantics of the code very clear to the code maintainer.

Of course, C# is most interesting to me, but if there are neat ways to express this in other languages, I’d love to see them too.

If there are any particularly amusing or interesting implementations I’ll dissect them on the blog in a future episode, probably in a week or so. I’m not going to have time to do a detailed analysis of every one.

And… go!

  • This is my attempt at making the code quite explicit in what it's doing by using extension methods. It's not the most efficient of ways but I think it does say what it is trying to do (arguably anyway).

    //the main method
    public string StringQuibble( IEnumerable<string> strings )
    {
       return "{" + strings.Concat( ", ").ReplaceLastDelimiter( ", ", " and " ) + "}";
    }
    ///the extension methods
    public static class StringQubbleExtensions
    {
       public static string Concat( this IEnumerable<string> strings, string delimiter )
       {
           StringBuilder sb = new StringBuilder();
           if( strings != null )
           {
               IEnumerator<string> stringsEnumerator = strings.GetEnumerator();
               if( stringsEnumerator.MoveNext() )
               {
                   sb.Append( stringsEnumerator.Current );
                   while( stringsEnumerator.MoveNext() )
                   {
                       sb.Append( delimiter );
                       sb.Append( stringsEnumerator.Current );
                   }
               }
           }
           return sb.ToString();
       }

       public static string ReplaceLastDelimiter( this string str, string delimiterToReplace, string delimiterReplacement )
       {
           if( str.IndexOf( delimiterToReplace ) > -1 )
           {
               StringBuilder sb = new StringBuilder( str );
               return sb.Replace( delimiterToReplace, delimiterReplacement, str.LastIndexOf( delimiterToReplace), delimiterToReplace.Length ).ToString();
           }
           return str;
       }
    }

  • Still terse:    (Sam... calling Count() is cheaper than creating a new list... Take, Skip and Concat use deferred execution so the only list building is done at the ToArray calls, the second of which is always two or less elements in size)

    static string CommaQuibble(IEnumerable<string> words)
    {
         int clip = words.Count() - 2;
         var head = words.Take(clip);
         var tail = words.Skip(clip);
         return string.Format("{{{0}}}", string.Join(", ", head.Concat(new[] { string.Join(" and ", tail.ToArray()) }).ToArray()));
    }

  • public static string FormatList(IEnumerable<string> source)
    {
       string last = source.DefaultIfEmpty().Last();
       return "{" +
           source.DefaultIfEmpty()
           .Aggregate(new StringBuilder(),
           (acc, next) => acc.AppendFormat(", {0}", next),
           acc => Regex.Replace(acc.ToString().Substring(2), ", " + last + "$", " and " + last))
           + "}";
    }

  • @Chris Benard: what if the last string contains " ,"?
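
    The concern generalizes to any approach that joins everything first and then rewrites the last delimiter. A small sketch of the failure mode follows; `LastDelimiterPitfall` and `JoinThenReplace` are illustrative names, not code from any particular commenter.

```csharp
using System;
using System.Collections.Generic;

static class LastDelimiterPitfall
{
    // Join everything with ", " and then turn the last ", " into " and ".
    // This is the join-then-replace pattern reduced to its essentials.
    public static string JoinThenReplace(IEnumerable<string> words)
    {
        string joined = string.Join(", ", words);
        int last = joined.LastIndexOf(", ", StringComparison.Ordinal);
        if (last >= 0)
            joined = joined.Substring(0, last) + " and " + joined.Substring(last + 2);
        return "{" + joined + "}";
    }

    public static void Main()
    {
        // Fine when no item contains the delimiter.
        Console.WriteLine(JoinThenReplace(new[] { "ABC", "DEF", "G" }));  // {ABC, DEF and G}

        // The last ", " sits inside the final item, so the wrong one is
        // replaced. The rules call for {ABC and D, E}.
        Console.WriteLine(JoinThenReplace(new[] { "ABC", "D, E" }));      // {ABC, D and E}

        // A single item that contains the delimiter is altered as well.
        Console.WriteLine(JoinThenReplace(new[] { "D, E" }));             // {D and E}
    }
}
```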

  • Fair enough: Count() is cheaper than creating a new collection if the object implementing IEnumerable also implements ICollection. That said, I looked at this problem under the assumption that we're dealing with Foo : IEnumerable<string> and nothing else.

  • Revised version which only iterates the source once:

    public static string FormatList(IEnumerable<string> source)
    {
       string last = string.Empty;
       return "{" +
           source.DefaultIfEmpty()
           .Aggregate(new StringBuilder(),
           (acc, next) => acc.AppendFormat(", {0}", last = next),
           acc => Regex.Replace(acc.ToString().Substring(2), ", " + last + "$", " and " + last))
           + "}";
    }

  • Hideously obfuscated solution:

    public static string FormatList(IEnumerable<string> source)
    {
       var array = source.ToArray();
       return "{" + ((array.Length == 0) ? string.Empty : (array.Length == 1) ? array[0] :
           string.Join(", ", array, 0, array.Length - 1) + " and " + array[array.Length - 1]) + "}";
    }

  • Eric, I like your use of Aggregate().

  • Materializing the list of strings is not that costly in this task. The resulting string will consume about the same amount of memory as the list if the average word length is two characters (on x86). So we can simply build a list and look at the indices.

    public static string Join3(IEnumerable<string> strings)
    {
        var list = (strings as IList<string>) ?? strings.ToList();
        var result = new StringBuilder("{");
        for (int i = 0; i < list.Count; i++)
        {
            result
                .Append(i == 0
                            ? ""
                            : (i == list.Count - 1)
                                ? " and "
                                : ", ")
                .Append(list[i]);
        }
        return result
            .Append("}")
            .ToString();
    }

    Or do a little more optimization for the case when the source is an ICollection.

    public static string Join4(IEnumerable<string> strings)
    {
        var list = (strings as ICollection<string>) ?? strings.ToList();
        var result = new StringBuilder("{");
        int i = 0;
        foreach (var item in list)
        {
            result
                .Append(i == 0
                            ? ""
                            : (i == list.Count - 1)
                                ? " and "
                                : ", ")
                .Append(item);
            i++;
        }
        return result
            .Append("}")
            .ToString();
    }

  • @Jon Skeet: A quick look found the following: Jacob, Skrud, Tim Jarvis, Pankaj Sharma. I got bored at that point and had a train to catch.

  • Sam Webb is right about Count().

    If you're going for an iterative solution, I see three decent approaches:

    The first is to manually enumerate (and only once), with some tricks to special case the first and last element, or the last two elements (depending on whether you consider the separators as a prefix or as a suffix to the elements). Jon Skeet did this well, but my favorite in this category must be John Spong's solution, for his elegant improvement over using some buffer variables by using a queue.

    The second approach is to immediately convert to a list and then index the items in a straightforward way. (String.Join is also a good option here.) My favorite here is by Nick (at April 15, 2009 6:11 PM).

    The third approach is to disregard the number of times you enumerate the list (as long as it's a constant; I don't like the idea of a solution with quadratic complexity, for a simple problem like this) and use some LINQ magic for getting something elegant. Olivier Leclant does this well.
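
    As a rough illustration of the first approach, here is a single-pass sketch that buffers the last two items in a queue. It is written from the description above and is not John Spong's actual code; `SinglePassQuibble` and `Format` are names chosen for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class SinglePassQuibble
{
    public static string Format(IEnumerable<string> words)
    {
        var result = new StringBuilder("{");
        var pending = new Queue<string>();   // holds at most the last two items seen

        foreach (string word in words)
        {
            pending.Enqueue(word);
            // Once a third item arrives, the oldest one cannot be among the
            // final two, so it is safe to emit it followed by a comma.
            if (pending.Count > 2)
                result.Append(pending.Dequeue()).Append(", ");
        }

        // Whatever is left is the final pair, a single item, or nothing.
        if (pending.Count == 2)
            result.Append(pending.Dequeue()).Append(" and ");
        if (pending.Count == 1)
            result.Append(pending.Dequeue());

        return result.Append("}").ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Format(new string[0]));                      // {}
        Console.WriteLine(Format(new[] { "ABC" }));                    // {ABC}
        Console.WriteLine(Format(new[] { "ABC", "DEF" }));             // {ABC and DEF}
        Console.WriteLine(Format(new[] { "ABC", "DEF", "G", "H" }));   // {ABC, DEF, G and H}
    }
}
```

    Because items are never inspected, only counted, strings that contain commas or "and" pass through unchanged.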

  • @rbirkby: Of those, only Pankaj Sharma's solution seems to be obviously convertible to use string concatenation with no loss of either readability or performance. Yes, some of those others could be converted into one very complicated string concatenation, possibly including conditional expressions - but I don't think that's actually a good idea.

  • To make it clear for everyone who uses Count(), let's assume this simple test:

    static void Main(string[] args)
    {
        string result;
        var start = DateTime.Now;
        {
            result = Join(GetStrings(10));
        }
        var end = DateTime.Now;
        Console.WriteLine(@"It takes {0}s to get ""{1}"" result", (end-start).TotalSeconds, result);
        Console.WriteLine("Press enter...");
        Console.ReadLine();
    }

    public static IEnumerable<string> GetStrings(int n)
    {
        for (int i = 0; i < n; i++)
        {
            // Do some hard work here...
            System.Threading.Thread.Sleep(500);
            yield return ((char)('a' + i)).ToString();
        }
    }

    Here the source implements nothing except IEnumerable, and enumeration is very costly. So it is usually more efficient to build a list first and then work on it, rather than call Count(), which internally performs an enumeration, and then enumerate twice more.
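
    The same point can be shown without waiting on Sleep by counting passes over the source. The sketch below is an assumption-laden variation on the test above: `EnumerationCost`, `Passes`, `PassesWithCount` and `PassesWithList` are invented names, and the counter stands in for the hard work.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class EnumerationCost
{
    // Number of times the source has started to be enumerated.
    public static int Passes;

    // Implements only IEnumerable<string>; records each pass instead of sleeping.
    public static IEnumerable<string> GetStrings(int n)
    {
        Passes++;   // runs on the first MoveNext of every new enumeration
        for (int i = 0; i < n; i++)
            yield return ((char)('a' + i)).ToString();
    }

    // Count() walks the whole source, then Skip() walks it again.
    public static int PassesWithCount()
    {
        Passes = 0;
        IEnumerable<string> words = GetStrings(10);
        int clip = words.Count() - 2;
        string[] tail = words.Skip(clip).ToArray();
        return Passes;
    }

    // ToList() walks the source once; everything afterwards uses the copy.
    public static int PassesWithList()
    {
        Passes = 0;
        List<string> list = GetStrings(10).ToList();
        int clip = list.Count - 2;
        string[] tail = list.Skip(clip).ToArray();
        return Passes;
    }

    public static void Main()
    {
        Console.WriteLine("Count() then Skip(): {0} passes", PassesWithCount());  // 2
        Console.WriteLine("ToList() first: {0} pass", PassesWithList());          // 1
    }
}
```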

  • Maybe I misunderstood something but my function in python is only 1-line long (there are no checks for types and so on) with full functionality

    >>> def f(input): return "{"+" and ".join([", ".join(input[:-1]),input[-1]])+"}"
    ...
    >>> print f(["ABC", "DEF", "G", "H"])
    {ABC, DEF, G and H}

  • Here is my proposal:

    **********************************************

    using System;
    using System.Collections.Generic;
    using System.ComponentModel;
    using System.Data;
    using System.Drawing;
    using System.Linq;
    using System.Text;
    using System.Windows.Forms;

    namespace LippertChallange0415
    {
       public partial class Form1 : Form
       {
           public Form1()
           {
               InitializeComponent();
               this.Load += new EventHandler(Form_Load);
           }

           public void Form_Load(object sender, EventArgs e)
           {
               StringBuilder collector = new StringBuilder();
               collector.Append(GetCommaQuibbledString(new string[] {}));
               collector.Append("\n");
               collector.Append(GetCommaQuibbledString(new string[] {null, "NICK", "IS", "GOOD LOOKING", "COOL"}));
               collector.Append("\n");
               collector.Append(GetCommaQuibbledString(new string[] {"PEANUT BUTTER", "JELLY" }));
               collector.Append("\n");
               collector.Append(GetCommaQuibbledString(new string[] {"1", "2", "3", "4", "5", "6", "7", "8", "9", "10" }));
               collector.Append("\n");
               MessageBox.Show(collector.ToString());
           }

           public string GetCommaQuibbledString(IEnumerable<string> input)
           {
               StringBuilder rtrn = new StringBuilder();
               rtrn.Append("{");
               if (input != null)
               {
                   IEnumerator<string> strings = input.GetEnumerator();
                   if (strings.MoveNext())
                   {
                       string current = null;
                       string next = strings.Current ?? string.Empty;
                       int addedCount = 0;
                       // keep going as long as we haven't processed the last item
                       while (next != null)
                       {
                           // replace current with next, get the next item
                           if (strings.MoveNext())
                           {
                               current = next;
                               next = strings.Current ?? string.Empty;
                               // add the appropriate separator (if any)
                               if (addedCount > 0)
                                   rtrn.Append(", ");
                           }
                           else
                           {
                               current = next;
                               next = null;
                               // add the appropriate separator (if any)
                               if (addedCount > 0)
                                   rtrn.Append(" and ");
                           }
                           // add the current item
                           rtrn.Append(current);
                           addedCount++;
                       }
                   }
               }
               rtrn.Append("}");
               return rtrn.ToString();
           }
       }
    }
