Building Tuple [Matt Ellis]

For readers who are interested in the work that goes into designing a feature, I wrote an article for MSDN Magazine that appears in this month’s issue.  Check out CLR Inside Out: Building Tuple, which introduces the new Tuple type and discusses the design work behind it.

I’d love to hear feedback on what you think about the article and if you’d like to see more behind the scenes design articles in the future.  I’d also love to answer any questions about the design or why we made the decisions we did.

Also, there is one change we are thinking about making between Beta 1 and Beta 2 around tuple.  In Beta 1 we have a factory method, Tuple.Create, which builds tuples and has some nice type inference properties.  In Beta 1 the overload of this method which takes eight arguments requires that the last element is a tuple and builds an extended tuple.  For example:

Tuple.Create(1, 2, 3, 4, 5, 6, 7, Tuple.Create(8));

Will build an eight element tuple that looks like [1, 2, 3, 4, 5, 6, 7, 8].

Tuple.Create(1, 2, 3, 4, 5, 6, 7, Tuple.Create(8, 9));

Will build a nine element tuple that looks like [1, 2, 3, 4, 5, 6, 7, 8, 9].

For Beta 2 we hope to change this so that the eight-argument version of Tuple.Create always builds an eight element tuple.  In this case:

Tuple.Create(1, 2, 3, 4, 5, 6, 7, 8);

Will build an eight element tuple that looks like [1, 2, 3, 4, 5, 6, 7, 8].

Tuple.Create(1, 2, 3, 4, 5, 6, 7, Tuple.Create(8));

Will build an eight element tuple that looks like [1, 2, 3, 4, 5, 6, 7, [8]].

Tuple.Create(1, 2, 3, 4, 5, 6, 7, Tuple.Create(8, 9));

Will build an eight element tuple that looks like [1, 2, 3, 4, 5, 6, 7, [8, 9]].

If you want to build tuples with more than eight elements and your language doesn’t have special tuple syntax, you’ll have to use the Tuple constructors directly.  If we see lots of people doing this we’ll add more overloads to Tuple.Create.
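As a rough sketch of what using the constructors directly might look like, the snippet below builds a nine element tuple by nesting a two element tuple in the eighth (Rest) slot. It assumes the System.Tuple types as they shipped in .NET Framework 4; the class and method names (TupleCreateDemo, BuildNine) are made up for illustration.

```csharp
using System;

public static class TupleCreateDemo
{
    // Hypothetical helper: builds [1, 2, ..., 9] with the constructors,
    // placing the eighth and ninth elements in a nested tuple.
    public static Tuple<int, int, int, int, int, int, int, Tuple<int, int>> BuildNine()
    {
        return new Tuple<int, int, int, int, int, int, int, Tuple<int, int>>(
            1, 2, 3, 4, 5, 6, 7, new Tuple<int, int>(8, 9));
    }

    public static void Main()
    {
        var nine = BuildNine();
        Console.WriteLine(nine.Item7);      // seventh element
        Console.WriteLine(nine.Rest.Item1); // eighth element
        Console.WriteLine(nine.Rest.Item2); // ninth element
    }
}
```

The elements past the seventh are reached through the Rest property, which is why the type arguments have to spell out the nested tuple.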

Thanks for reading; I hope everyone enjoys the article, and I look forward to your questions and comments.  Cheers!

  • Yes! I liked this article, and love these types of articles. Understanding why something is the way it is really helps in many cases. I'm not too thrilled about having 2- and 3-tuples be reference types, but if the F# team said it didn't bother them, I guess that's good enough for now...

    On the C# side, it's a bit conflicting how we use them now, since we count on them being reference types for memory impact. But I guess we'll just make specific structs or a StructTuple for those cases when needed.

    What I didn't understand is why tuples don't act like structs as far as comparison goes.

    Tuple.Create(1, 2) != Tuple.Create(1, 2) // this is totally crazy.

    Really, that needs to be looked at again. It seems like a source of bugs and confusion for == not to work, but .Equals to work. It severely limits how we'd want to use Tuples in C# in general.

    Thanks!

  • This is a good change you made. From a usability perspective the old behavior is really confusing.

    It's actually a pity that we need this factory method. It would be much nicer if C# and VB supported inference on constructors, but I understand the trouble we’d get into with such a language feature.

  • I'd really like to build tuples like this:

    var tuple = new { "element1", "element2" };

    var e1 = tuple.Item1;

    var e2 = tuple.Item2;

    The factory API could then be called behind the scenes.

  • In my opinion, the Beta 2 version of Tuple.Create() is better, since the outcome of the first one would have left me dumbstruck the first time I used it.

    I am also really glad that you rejected the idea of naming the tuple properties with English numeral names - there are a lot of non-English folks out there working with .NET and expecting them to really understand the numeral word permutations is a risky bet.

    Why not use an indexed or a collection property instead of the individual ItemX properties?

  • I've read the article.

    I still don't like the idea of Tuple being a reference type. It means that there's one more type for which I will have to do recurrent pointless null checks for no good reason. In my opinion, Tuple should be a canonical example of a type that is absolutely, clearly a value type and nothing else.

    In any case, overloading == and != for Tuple is a must regardless of whether it's a value or reference type. It's a general rule of thumb when overriding Object.Equals (doesn't the C# compiler warn you about this?), it is a clear indication to the user that the type has value semantics, and other BCL and FCL reference-but-really-value types do it (e.g. System.Uri, or System.Xml.Linq.XName).

  • I was very surprised to read in the article that Tuple will be a reference type. The article focused on the performance aspects, and concluded that there was no significant performance loss to making it a reference type. My question is, why would you want it to be a reference type in the first place? As the Framework Design Guidelines point out, and MS DevDiv members such as Eric Lippert have blogged about repeatedly, the choice between value type and reference type should be about semantics, not implementation, since implementations change. Surely Tuple fits the semantics of a value type better than those of a reference type? MichaelGG's example illustrates that perfectly. I am forced to ask: if Tuple is not a value type, what is? Why do value types exist at all?

    "...we were unable to find compelling reasons for Tuple to implement interfaces like IEquatable<T> and IComparable<T>, even though it overrides Equals and implements IComparable."

    I would think that the compelling reason is the one that was just stated: the type already implements IComparable, so for consistency it should implement IComparable<T>. Again, semantics and usability should be the primary design goal, not performance. Implementing the generic interface should not be rejected unless it can be proven that doing so will cause hard performance goals to not be met. This is what Microsoft designers have been preaching for years. The article gives the impression that the decision to not implement the generic interfaces was made out of fear, not after testing the actual implementation against pre-defined metrics. Perhaps the article gave the wrong impression?

  • @Alex O

    The reason we didn't use an indexed or collection property is that the only sensible return type for it would be Object, so you'd lose the nice type-safe properties of Tuple when you pulled items back out.
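To make the trade-off concrete, here is a minimal sketch comparing the ItemX properties with an indexer. IndexedPair is a hypothetical wrapper written only for this illustration; it is not part of the framework, and it simply assumes an indexer would have to return object.

```csharp
using System;

public static class IndexerVersusItemDemo
{
    // Hypothetical wrapper exposing a pair's elements through an indexer.
    // The only return type that fits every element is object.
    public sealed class IndexedPair<T1, T2>
    {
        private readonly Tuple<T1, T2> inner;

        public IndexedPair(T1 first, T2 second)
        {
            inner = Tuple.Create(first, second);
        }

        public object this[int index]
        {
            get
            {
                switch (index)
                {
                    case 0: return inner.Item1;
                    case 1: return inner.Item2;
                    default: throw new ArgumentOutOfRangeException("index");
                }
            }
        }
    }

    public static void Main()
    {
        var pair = Tuple.Create(42, "answer");
        int number = pair.Item1;   // statically typed as int
        string text = pair.Item2;  // statically typed as string

        var indexed = new IndexedPair<int, string>(42, "answer");
        int unboxed = (int)indexed[0];    // cast needed, only checked at run time
        string cast = (string)indexed[1]; // a wrong cast here would compile fine

        Console.WriteLine(number + unboxed);
        Console.WriteLine(text == cast);
    }
}
```

With the indexer, every read needs a cast (and value types get boxed), so mistakes that the ItemX properties catch at compile time would surface only when the code runs.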

  • @MichaelGG,

    Regarding the Value vs. Reference type decision, as I pointed out in the article, we did consider a split design where two and three element tuples would be value types, but the rest would be reference types, but there was strong pushback from the language teams about that due to the confusing semantic issues.

    @MichaelGG, @pminaev,

    With respect to overloading == and !=, there's a comment in the design guidelines that addresses this:

    Section 8.10.2 deals with equality operators on reference types.  My copy of the book says to consider not overloading equality operators on reference types, even if you override Equals or implement IEquatable<T>, and to avoid overloading the operators if the implementation would be significantly slower than that of reference equality.

    On the relevant MSDN page [1] it says: "Most languages do provide a default implementation of the equality operator (==) for reference types. Therefore, you should use care when implementing the equality operator (==) on reference types. Most reference types, even those that implement the Equals method, should not override the equality operator (==)."

    Now perhaps it makes sense to break this guideline if we want the type to feel more like a value type.  I'll discuss this issue with the team and see if we want to make that change.

    [1]: http://msdn.microsoft.com/en-us/library/7h9bszxx.aspx
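The behavior under discussion can be shown in a few lines. This sketch assumes the Tuple classes as they shipped in .NET Framework 4, where Equals is overridden but == is not overloaded; the class and method names are illustrative only.

```csharp
using System;

public static class TupleEqualityDemo
{
    public static bool OperatorEquals()
    {
        var a = Tuple.Create(1, 2);
        var b = Tuple.Create(1, 2);
        return a == b;       // reference comparison of two distinct instances
    }

    public static bool MethodEquals()
    {
        var a = Tuple.Create(1, 2);
        var b = Tuple.Create(1, 2);
        return a.Equals(b);  // element-by-element comparison
    }

    public static void Main()
    {
        Console.WriteLine(OperatorEquals()); // False
        Console.WriteLine(MethodEquals());   // True
    }
}
```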

  • Re: Null -- yea, it sucks. But null sucks in general, and it's something we just have to deal with. I don't care as much since I'm mostly using F#, so a lot of these issues don't bother me.

    I agree that having a split where 2- and 3-tuples are structs, and larger are reference types would probably be more hassle and confusion than it's worth.

    Don Syme wrote a bit about why F# tuples are reference types. Part of the explanation was that passing around large tuples could be costly when making function calls.

    (Supposedly in the future, F# will have a quick way to specify value-type records/tuples.)

    Honestly, I think it'd just be fantastic if the JIT knew about tuples and could selectively do smart things with them (like the F# compiler appears to always decompose arguments that are tuples).

    From a C# perspective, the non-null and equality semantics seem really, really weird. I'm unaware of anyone who thinks of a tuple as a reference type semantically -- it's a sequence of values! Please, please reconsider this part of the design. That design guideline page also says:

    "Consider implementing operator overloading for the equality (==), not equal (!=), less than (<), and greater than (>) operators when you implement IComparable."

    As well as:

    "Override the equality operator (==) if your type is a base type such as a Point, String, BigNumber, and so on" -- I'd say Tuple is about as basic a type as you get...

    I also didn't really understand why only the non-generic interfaces would be implemented.

    Anyways, thanks for discussing this and publishing details!

  • My feedback: I had to check that a) I didn't actually work for Microsoft, b) I hadn't designed and built a Tuple class and c) I hadn't written an article for msdn.

    It took a moment, but I got there.

    Matt "not that one" Ellis

  • I understand the reasoning behind making it a reference type now, thank you.

    The reason why Tuple should get an overloaded operator== regardless is simply because the default reference-equality version is nonsensical for a Tuple (as it is for any other immutable reference type that represents a value). There's absolutely no benefit in being able to determine that two Tuple variables reference the same instance - there's nothing useful you can derive from that information. I would again like to point out classes such as Uri, XName and XNamespace that do that correctly. If FDG does not cover this case, it is a fault in FDG.

    As a side note, anonymous classes in C# do not redefine operator== to mean structural equality, even though they do redefine Object.Equals. However, when I created a Connect ticket about it (https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=349014), Mads Torgersen replied that, in retrospect, they should indeed have made it so, even though they cannot change it now for back-compat reasons. Please don't fall into the same trap! ;)
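The anonymous type behavior mentioned in that side note can be reproduced with a short sketch. The property names X and Y and the class name are arbitrary choices for illustration.

```csharp
using System;

public static class AnonymousTypeEqualityDemo
{
    public static bool OperatorEquals()
    {
        var a = new { X = 1, Y = 2 };
        var b = new { X = 1, Y = 2 };
        return a == b;       // same anonymous type, but == compares references
    }

    public static bool MethodEquals()
    {
        var a = new { X = 1, Y = 2 };
        var b = new { X = 1, Y = 2 };
        return a.Equals(b);  // compiler-generated Equals compares X and Y
    }

    public static void Main()
    {
        Console.WriteLine(OperatorEquals()); // False
        Console.WriteLine(MethodEquals());   // True
    }
}
```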

  • Good article, I like this kind of in-depth article.

    I agree with the other commenters - Tuple should be a struct, and Tuple.Create(1, 42) == Tuple.Create(1, 42) should be true.

  • C# 3.0 added anonymous types that have a pretty straightforward way of defining class member names and use type inference, plus it does not have a built-in limitation on the number of members that can be specified.

    So, instead of creating a tuple as yet another language abstraction, why not just enhance the anonymous type mechanism by allowing the anonymous types to expose the required interfaces (e.g. IStructuralComparable etc.) and add the ability to pass instances of such types around as strongly typed entities?

    Cheers,

    Alex

  • Alex, that (passing around anonymous types) would require structural type equivalence if you want to be able to do that seamlessly between assemblies. And the CLR is very much centered around the notion of nominal typing, though NoPIA is a very restricted form of structural typing (which anonymous classes won't be able to reuse). So this would require a fairly major change to the CLR to implement.

  • "So this would require a fairly major change to the CLR to implement."

    Unless anonymous types in C# used Tuple underneath, which is essentially what F# already does, and would make a lot of sense. Unfortunately it would probably be a breaking change at this point.
