Curiouser and curiouser

Curiouser and curiouser

Rate This
  • Comments 24

Here's a pattern you see all the time in C#:

class Frob : IComparable<Frob>

At first glance you might ask yourself why this is not a "circular" definition; after all, you're not allowed to say "class Frob : Frob"(*). However, upon deeper reflection that makes perfect sense; a Frob is something that can be compared to another Frob. There's not actually a real circularity there.

This pattern can be genericized further:

class SortedList<T> where T : IComparable<T>

Again, it might seem a bit circular to say that T is constrained to something that is in terms of T, but actually this is just the same as before. T is constrained to be something that can be compared to T. Frob is a legal type argument for a SortedList because one Frob can be compared to another Frob.

But this really hurts my brain:

class Blah<T> where T : Blah<T>

That appears to be circular in (at least) two ways. Is this really legal?

Yes it is legal, and it does have some legitimate uses. I see this pattern rather a lot(**). However, I personally don't like it and I discourage its use.

This is a C# variation on what's called the Curiously Recurring Template Pattern in C++, and I will leave it to my betters to explain its uses in that language. Essentially the pattern in C# is an attempt to enforce the usage of the CRTP.

So, why would you want to do that, and why do I object to it?

One reason why people want to do this is to enforce a particular constraint in a type hierarchy. Suppose we have

abstract class Animal
{
    public virtual void MakeFriends(Animal animal);
}

But that means that a Cat can make friends with a Dog, and that would be a crisis of Biblical proportions! (***) What we want to say is

abstract class Animal
{
    public virtual void MakeFriends(THISTYPE animal);
}

so that when Cat overrides MakeFriends, it can only override it with Cat.

Now, that immediately presents a problem in that we've just violated the Liskov Substitution Principle. We can no longer call a method on a variable of the abstract base type and have any confidence that type safety is maintained. Variance on formal parameter types has to be contravariance, not covariance, for it to be typesafe. And moreover, we simply don't have that feature in the CLR type system.

But you can get close with the curious pattern:

abstract class Animal<T> where T : Animal<T>
{
    public virtual void MakeFriends(T animal);
}

class Cat : Animal<Cat>
{
  public override void MakeFriends(Cat cat) {}
}

and hey, we haven't violated the LSP and we have guaranteed that a Cat can only make friends with a Cat. Beautiful.

Wait a minute... did we really guarantee that?

class EvilDog : Animal<Cat>
{
  public override void MakeFriends(Cat cat) { }
}

We have not guaranteed that a Cat can only make friends with a Cat; an EvilDog can make friends with a Cat too. The constraint only enforces that the type argument to Animal be good; how you use the resulting valid type is entirely up to you. You can use it for a base type of something else if you wish.

So that's one good reason to avoid this pattern: because it doesn't actually enforce the constraint you think it does. Everyone has to play along and agree that they'll use the curiously recurring pattern the way it was intended to be used, rather than the evil dog way that it can be used.

The second reason to avoid this is simply because it bakes the noodle of anyone who reads the code. When I see List<Giraffe> I have a very clear idea of what the relationship is between the "List" part -- it means that there are going to be operations that add and remove things -- and the "Giraffe" part -- those operations are going to be on Giraffes. When I see "FuturesContract<T> where T : LegalPolicy" I understand that this type is intended to model a legal contract about a transaction in the future which has some particular controlling legal policy. But when I read "Blah<T> where T : Blah<T>" I have no intuitive idea of what the intended relationship is between Blah and any particular T. It seems like an abuse of a mechanism rather than the modeling of a concept from the program's "business domain".

All that said, in practice there are times when using this pattern really does pragmatically solve problems in ways that are hard to model otherwise in C#; it allows you to do a bit of an end-run around the fact that we don't have covariant return types on virtual methods, and other shortcomings of the type system. That it does so in a manner that does not, strictly speaking, enforce every constraint you might like is unfortunate, but in realistic code, usually not a problem that prevents shipping the product.

My advice is to think very hard before you implement this sort of curious pattern in C#; do the benefits to the customer really outweigh the costs associated with the mental burden you're placing on the code maintainers?


(*) Due to an unintentional omission, some past editions of the C# specification actually did not say that this was illegal! However, the compiler has always enforced it. In fact, the compiler has over-enforced it, sometimes accidentally catching non-cycles and marking them as cycles.

(**) Most frequently in emails asking "is this really legal?"

(***) Mass hysteria!

  • I think the curiously recurring generic pattern has one use you haven't covered, which I consider more legitimate than the examples in your blog: allowing statically-typed access to a derived type.

    I've only used CRGP once (so far), in this type:

     nitokitchensink.codeplex.com/.../62063

    which implements INotifyPropertyChanged on behalf of a derived class. It enables me to provide this method:

     protected void OnPropertyChanged<TProperty>(Expression<Func<TDerived, TProperty>> expression);

    in a way so that when the implementer of the derived class is writing:

     OnPropertyChanged(x => x.

    they get completion with IntelliSense.

  • At the cost of one down cast (disregarding the InvalidCastException if you guessed wrong) per call you can get the same information if not using CRGP.

  • As somebody who has been guilty of using this pattern before, I find it speaks more to the lack of a LISP-like macro feature or mixins in C# more than anything else. Its an attempt to do metaprogramming in a language that doesn't have a lot of support for it.

  • This is a poor man's self type annotations that C# does not have, but, say, Scala and Fortress do. As you've pointed out, it does not enforce that self type is correct, which also means that compiler does not know the correct type of "this" in class body, but it's better than nothing. First, it quite clearly documents that T is supposed to be self type, not just any type, and second, the compiler will know correct type of T in the body.

  • I've been using it in several cases, but typically only for code which is not directly exposed to "end-users" (e.g. private classes, or abstract classes used through derived classes).

    Usually I call the type parameter TSelf and the derived classes are often sealed classes. One such example can be found here: code.google.com/.../GrammarObject.cs

  • I have a similar problem in my Protocol Buffers (serialization framework) port. We have the concept of a "message" type (immutable) and a corresponding "builder" type (mutable). So we end up with a generic interface which is partly self-referential and partly mutually-referential to its corresponding one. Extracts of both:

     public interface IMessage<TMessage, TBuilder> : IMessage<TMessage>

         where TMessage : IMessage<TMessage, TBuilder>

         where TBuilder : IBuilder<TMessage, TBuilder>

     {

         TBuilder ToBuilder();

         // ... other stuff

     }

     public interface IBuilder<TMessage, TBuilder> : IBuilder

         where TMessage : IMessage<TMessage, TBuilder>

         where TBuilder : IBuilder<TMessage, TBuilder>

     {

         TMessage Build();

         // ... other stuff

     }

    I don't like it, but it generally makes the rest of the code cleaner. Fortunately *most* client code only cares about the concrete types (which are generated from schema files). I'm not sure whether I'd say that it's a pity that C# doesn't allow for a richer way of expressing generic type relationships, but there are certainly times where it introduces ugliness.

  • I'm hoping this is the start of a series of "hypothetical future version of C#" posts. :) Sometimes C# backs you into a corner, and this is the just the least terrible way to handle it.

  • I would like to be able to do this:

    abstract class Foo<T> : T where T : interface

    {

    }

  • >> This is a poor man's self type annotations that C# does not have, but, say, Scala and Fortress do.

    As Eric noted, a self-typed argument is inherently covariant, while LSP requires arguments to be contravariant. You actually need path-dependent types to sort this out, but they are tricky to get right themselves.

    Ironically, a type safety problem in Eiffel, which has non-path-dependent self-type ("other: like Current"), and allows their use in method arguments, is known as "catcall" in Eiffel parlance.

  • So why aren't there covariant return types on virtual methods? It seems like an obvious feature to have, which makes me think that it was considered, but hasn't met the feature bar for any release yet to date.

    Covariance on virtual method return types is a fairly frequently requested feature, and I'd use it if I had it. It's never made the bar because (1) the CLR does not support it; we'd have to either generate lots of helper code behind the scenes to make it work, or convince the CLR team to change (the C++/CLI team did the former) (2) in most cases you can generate the necessary helper functions yourself quite easily; just make a "new" method that has the right return type that delegates its implementation to the virtual method, and (3) Anders does not consider it to be a particularly important feature. - Eric

  • Heh, interesting scenario (Cat, Dog, MakeFriends). I suspect that the Best Tool(tm) for that job is typeclasses (e.g. as in Haskell). C# of course does not have type classes. I vaguely recall seeing an example somewhere where an author had simulated typeclasses in C#, it required manual work on the part of the programmer.

    In the end, for non-mission-critical code, I concur with your conclusion: don't enforce the constraint at compile time, but rather assert the type in the implementation of MakeFriends. If you need the certainty of compile time checking, consider a different programming language that is designed to provide it.

  • > You actually need path-dependent types to sort this out

    I think you're confusing path dependent types with F-bounded polymorphism (www.cs.cmu.edu/.../f-bounded.pdf, the proverbial Eiffel example is mentioned on the second page). And the thing is, C# almost has it. You can specify F[t] (e.g. IComparable<T>) and you can express condition t<:F[t] in where for functions (though a shorter syntax may be handy). The problems are

    1) you have to explicitly declare that t<:F[t] for your type t, but there's nothing stopping you from declaring t<:F[u]

    2) F bound is inherited, which also creates cases where t<:F[u] for t!=u

    So if there was a way to force t<:F[t] when implementing (which is what self type annotation does) and also, say, only allow implementing F[t] on sealed classes, we should be fine.

    @Eric, btw, in C++ any template that is supposed to receive derived type as one of the arguments is called CRTP. So IComparable<T> also qualifies. There's no equivalent to IComparable<T> where T:IComparable<T>, as C++ does not have a type system for templates.

  • HarHar!

    You can sprinkle in even more generics for a laugh...4 years ago this design happened unto me. I'd probably do it differently now, but it was a wonderful neuron twister back then...enjoy! http://realfiction.net/go/57

  • >> I think you're confusing path dependent types with F-bounded polymorphism  ... So if there was a way to force t<:F[t] when implementing (which is what self type annotation does) and also, say, only allow implementing F[t] on sealed classes, we should be fine.

    I was talking about the more general case, not "sealed classes only" (that's practically cheating! ~). That's where you need path-dependent types, for type system to be able to reason about whether the argument you're passing is of the same _dynamic_ type as the receiver.

  • > That's where you need path-dependent types, for type system to be able to reason about whether the argument you're passing is of the same _dynamic_ type as the receiver.

    Can you give a link on how path-dependent types help? I've only read about path-dependent types in the context of Scala, and Scala doesn't solve the problem.

    Also, the problem with inheriting F[t] is much more subtle than with implementing F[u] -- it doesn't break type safety, just allow using operations that discard new data in derived classes. But I'm not sure what else you can do in languages with subtyping.

Page 1 of 2 (24 items) 12