Set Collection [Ryan Byington]

Set Collection [Ryan Byington]

  • Comments 17

So the BCL thinking of implementing a set collection for a future release and we would like to get you our blog reading public opinion on what scenarios are important to you for a Set collection. The collection will likely have the standard ICollection<T> APIs like Add, Remove, Clear, and Count and we would likely support the following scenarios:

Test for membership (ie does element A exist in set T)

Test for subset (ie is set T a subset of set U)

Union

Intersection

Complement

 

But what other functions should a Set collection be able to perform?

  • I'd like a static property on it for null set, much like String.Empty - Set.NullSet or Set.Empty or some such notation.

    If there will be a way to check if Set A is a subset of Set B, there should also be a way to check if Set A is a superset of Set B - admittedly subset technically covers all the ways of checking for a superset (just reverse the operands), but having both methods available would be valuable.

    In the subset-checking operation, there should be a way to check if a subset is a proper subset, or an improper subset.
  • This would be a very useful addition.  Will there be an analog to STD C++'s multiset?

    It would also be useful to have an analog of multimap unless there is some way to get that behavior that we have missed?
  • Hmm.  I thought the whole point of the Power Collections project was to test out different types of collections.  These collections could then be implemented in the CLR if they proved "successful".

    If so, what's wrong with incorporating the Power Collections Set?
  • While building in the common base functionality is crucial, it will of course will impossible to cover everyone’s’ specific needs. Thus, I hope for an extendable design. There are a variety of ways to implement a set data structure, such as basic arrays, hashes, linked lists, trees, and others. Then there are ordered sets as well. And that’s just the start. I would like to see an ISet<T> interface to allow for customization.
  • Test for Overlap... Overlaps or IsDisjoint
  • I would suggest that you look at the Wintellect.PowerCollections.Set<T> class and provide that functionality along with an ISet<T> interface.  I have not yet found any methods that I need that Wintellect.PowerCollections.Set<T> does not provide.  

    But it would be really nice to have the class as part of the BCL.
  • +1 for just slurping in the PowerCollections.
  • One question is whether the set operations operate in-place, or always return a resultant set.  

    In our own experience, we find that the in-place set operations are usually the most efficient, while still remaining intuitive. We use static class methods to return the results of operations on two sets.  The only problem is that the static methods couldn't be defined in ISet<T>, but could be defined in Set<T> : ISet<T>.

    Also, we find it very useful to make the arguments to all set operations to be IEnumerable<T> instead of ISet<T>.  That way you can compose sets from sets, lists, collections, LINQ results, etc.  When performing the union of an ISet<T> with an IEnumerable<T>, it is not an exception for the IEnumerable<T> to contain multiple instances of the same T, since any instance of T is only ever added to the set once.

    eg:  class ISet<T> : ICollection<T>
        {

           void Union(IEnumerable<T> collection);
           void Intersection(IEnumerable<T> collection);
           void Complement(IEnumerable<T> collection);

        }

        class Set<T> : ISet<T>, ICollection
        {

    public static ISet<T> Union<IEnumerable<T> a, IEnumerable<T> b> { ... }

    public static ISet<T> Intersection<IEnumerable<T> a, IEnumerable<T> b> { ... }

    public static ISet<T> Complement<IEnumerable<T> a, IEnumerable<T> b> { ... }

        }

  • Oops!  "class ISet<T>" should obviously have been "interface ISet<T>".  Mea culpa!

    Would also second the suggestion of having an interface IOrderedSet<T> : IList<T>.
  • Difference (A-B = all the elements in A which are not in B).

    What about bags (i. e., duplicates allowed)?  If there are bags, then conversions to/from set/bag (bag to set could fail if the bag has duplicates).  Sequences (ordered bags) too.

    See also http://www.jfsowa.com/logic/math.htm
  • If you are going to have an interface and allow multiple implementations of sets, then this would be a good way to do it:

    interface ISet<T> : ICollection<T> {...}
    class HashSet : ISet<T> {...}
    class TreeSet : ISet<T> {...}
    class ListSet : ISet<T> {...}
    static class Set {...} //any static methods that apply to sets but can be implemnted efficiently in terms of ISet<T> should be put here (since interfaces cannot have static methods):

    Otherwise, a simple class Set<T> : ICollection<T> would suffice.
  • Find, FindAll, & ConvertAll - ala List<T>

    Love those new methods.  I go back and forth on whether ForEach would make sense on a Set in actual practice (most people think ordered indexing when they think of for loops)...
  • Difference, I agree with David, — it is a MUST. Or you may call it subtraction.
  • I'll toss in another vote for a Subtraction function.
  • From past experiences the code for actions on multiple sets is ugly. It would be nice to have something like the following for each of the ISet methods on the Set class:

    public static Set<T> Union<T>( params ISet<T>[] sets ) { ... }
Page 1 of 2 (17 items) 12