What's the difference, part one: Generics are not templates

Because I'm a geek, I enjoy learning about the sometimes-subtle differences between easily-confused things. For example:

  • I'm still not super-clear in my head on the differences between a hub, router and switch and how it relates to the gnomes that live inside of each.
  • Hunks of minerals found in nature are rocks; as soon as you put them in a garden or build a bridge out of them, suddenly they become stones.
  • When a pig hits 120 pounds, it's a hog.

I thought I might do an occasional series on easily confounded concepts in programming language design. 

Here’s a question I get fairly often:

public class C
{
  public static void DoIt<T>(T t)
  {
    ReallyDoIt(t);
  }
  private static void ReallyDoIt(string s)
  {
    System.Console.WriteLine("string");
  }
  private static void ReallyDoIt<T>(T t)
  {
    System.Console.WriteLine("everything else");
  }
}

What happens when you call C.DoIt<string>? Many people expect “string” to be printed, when in fact “everything else” is always printed, no matter what T is.

The C# specification says that when you have a choice between calling ReallyDoIt<string>(string) and ReallyDoIt(string) – that is, when the choice is between two methods that have identical signatures, but one gets that signature via generic substitution – then we pick the “natural” signature over the “substituted” signature. Why don’t we do that in this case?

Because that’s not the choice that is presented. If you had said

ReallyDoIt("hello world");

then we would pick the “natural” version. But you didn’t pass something known to the compiler to be a string. You passed something known to be a T, an unconstrained type parameter, and hence it could be anything. So, the overload resolution algorithm reasons, is there a method that can always take anything? Yes, there is.
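
Here is a minimal sketch of the two call sites side by side. The class name OverloadDemo and the helpers ViaGeneric and Direct are made up for illustration, and the methods return strings instead of printing so the results are easy to compare; otherwise the overloads are the ones from the question.

```csharp
using System;

public static class OverloadDemo
{
    // Generic entry point (hypothetical name). T is unconstrained, so the
    // call below is bound at compile time to ReallyDoIt<T>(T).
    public static string ViaGeneric<T>(T t)
    {
        return ReallyDoIt(t);
    }

    // Direct entry point (hypothetical name). The argument is statically
    // known to be a string, so the non-generic overload wins the tie.
    public static string Direct(string s)
    {
        return ReallyDoIt(s);
    }

    private static string ReallyDoIt(string s)
    {
        return "string";
    }

    private static string ReallyDoIt<T>(T t)
    {
        return "everything else";
    }

    public static void Main()
    {
        Console.WriteLine(ViaGeneric<string>("hello world")); // everything else
        Console.WriteLine(ViaGeneric(42));                    // everything else
        Console.WriteLine(Direct("hello world"));             // string
    }
}
```

The only difference between ViaGeneric<string> and Direct is what the compiler knows about the argument at the point of the call: a T in one case, a string in the other.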

This illustrates that generics in C# are not like templates in C++. You can think of templates as a fancy-pants search-and-replace mechanism. When you say DoIt<string> in a template, the compiler conceptually searches out all uses of “T”, replaces them with “string”, and then compiles the resulting source code. Overload resolution proceeds with the substituted type arguments known, and the generated code then reflects the results of that overload resolution.

That’s not how generic types work; generic types are, well, generic. We do the overload resolution once and bake in the result. We do not change it at runtime when someone, possibly in an entirely different assembly, uses string as a type argument to the method. The IL we’ve generated for the generic type already has the method it’s going to call picked out. The jitter does not say “well, I happen to know that if we asked the C# compiler to execute right now with this additional information then it would have picked a different overload. Let me rewrite the generated code to ignore the code that the C# compiler originally generated...” The jitter knows nothing about the rules of C#.

Essentially, the case above is no different from this:

public class C
{
  public static void DoIt(object t)
  {
    ReallyDoIt(t);
  }
  private static void ReallyDoIt(string s)
  {
    System.Console.WriteLine("string");
  }
  private static void ReallyDoIt(object t)
  {
    System.Console.WriteLine("everything else");
  }
}

When the compiler generates the code for the call to ReallyDoIt, it picks the object version because that’s the best it can do. If someone calls this with a string, then it still goes to the object version.

Now, if you do want overload resolution to be re-executed at runtime based on the runtime types of the arguments, we can do that for you; that’s what the new “dynamic” feature does in C# 4.0. Just replace “object” with “dynamic” and when you make a call involving that object, we’ll run the overload resolution algorithm at runtime and dynamically spit code that calls the method that the compiler would have picked, had it known all the runtime types at compile time.
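
As a rough sketch of that replacement, here are the “object” and “dynamic” versions next to each other. The names DynamicDemo, ViaObject and ViaDynamic are invented for this example, and the methods return strings rather than printing; it assumes C# 4.0 or later with the usual runtime binder (Microsoft.CSharp) available.

```csharp
using System;

public static class DynamicDemo
{
    // Static binding (hypothetical name): the compile-time type of the
    // argument is object, so ReallyDoIt(object) is chosen once, at compile time.
    public static string ViaObject(object t)
    {
        return ReallyDoIt(t);
    }

    // Dynamic binding (hypothetical name): overload resolution is deferred
    // and re-run at runtime against the runtime type of the argument.
    public static string ViaDynamic(dynamic t)
    {
        return ReallyDoIt(t);
    }

    private static string ReallyDoIt(string s)
    {
        return "string";
    }

    private static string ReallyDoIt(object t)
    {
        return "everything else";
    }

    public static void Main()
    {
        Console.WriteLine(ViaObject("hello world"));  // everything else
        Console.WriteLine(ViaDynamic("hello world")); // string
        Console.WriteLine(ViaDynamic(42));            // everything else
    }
}
```

The flexibility is not free: every call made through a dynamic argument does its binding work at runtime instead of at compile time.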

  • Ethernet used to consist of a single piece of cable trailing round the building with each computer attached to it by a short piece of cable.  So only one computer could (usefully) transmit at a time.

    A hub replaces the long cable.  Each computer connects to the hub using twisted pair cable.

    A switch is a smart hub that allows different pairs of computers to communicate at the same time.

    A router routes packets between networks.

    This doesn't tell the whole story (e.g. hubs can be arranged in a tree).  And yes, it has nothing to do with the point you were making.

  • Using dynamic in this way is how I now choose to implement the C# equivalent of partial specialization. Prior to dynamic, the only alternatives were to use extension methods (which was fragile and confusing) or implement your own dynamic dispatch using delegates.

  • What happens if you have a type constraint applied to T? Such as:

     public static void DoIt<T>(T t) where T : String

    ... bad example constraining to the type string (since string is sealed it makes for a *very* specific constraint), but you get the idea.

    Such a constraint is illegal. If you wanted T to be only string then why would you make it generic in the first place? -- Eric

  • I didn't realize that such a constraint was actually illegal. I've never encountered it since it doesn't make sense.

    A constraint of "object" is also illegal, for related reasons. It is possible to produce an otherwise illegal constraint on a generic method type parameter, but you have to work at it:

    class B<T> { public virtual void M<U>() where U : T { } }
    class D: B<string> { public override void M<U>() {} }

    The constraint on U in D's version of M<U> is sealed type string, which would not be legal in any other situation. This oddity hits some corner cases in the CLR and causes a great deal of difficulty in type analysis and code gen; I'll blog someday about how I've screwed it up multiple times. -- Eric

    Here is a legal example:

    interface IThing {}
    public class C
    {
     public static void DoIt<T>(T t) where T: IThing
     {
       ReallyDoIt(t);
     }
     private static void ReallyDoIt(IThing t)
     {
       System.Console.WriteLine("IThing");
     }
     private static void ReallyDoIt<T>(T t)
     {
       System.Console.WriteLine("everything else");
     }
    }

    Answer: IThing is printed, as should be expected.

    Exactly. At compile time we know that the specific version is better than the generic version. -- Eric

  • Eric - why do these posts not appear simultaneously in your RSS feed?

  • A hub is an indiscreet gossip that repeats everything it hears on one port to all its other ports.

    A switch is a discreet messenger which delivers anything it hears on any port only to those ports that need to know.

    A router is an awkward postmaster that won't deliver anything if it isn't properly addressed.

  • David Morton actually had a great blog post about how it will be different in C# 4.0 using the dynamic keyword (what Eric mentioned at the end of his post). See it here: http://blog.davemorton.net/2009/05/dynamic-type-and-runtime-overload.html

  • Just a "reminder" to everyone...

    dynamic will NOT come close to duplicating what C++ templates do... it is something completely different [not better or worse... but very different].

    C++ templates do ALL of their work at compile time. There are NO runtime decisions made. So from a performance perspective, templates and (pre-dynamic) generics have very similar performance characteristics.

    When dynamic is used, all of the work gets moved into the runtime. Of course much will depend on the internals of C#/CLR and other aspects will depend on usage; but just consider what would happen (to performance) if a person writes image processing code where each pixel gets passed to a method as a parameter that is "dynamic"...

    I am just waiting until we see an "explosion" of places where dynamic is used because it "seemed like a good idea", and the application suffers enough that my company gets called in to address "performance issues".

  • TheCPUWizard said: So from a performance perspective, templates and (pre-dynamic) generics have very similar performance characteristics.

    This is definitely not true.  An array type meets the constraints for IEnumerable.  If I write a foreach loop against a type parameter (constrained to IEnumerable), the behavior is very different.  With templates, I get efficient array iteration with no bounds checking.  With generics, I get virtual calls to MoveNext and Current.  If the template wants to do something special for arrays and std::vector (because it knows the data are stored contiguously) it can.  Generics can't.

    .NET manages to do spot optimization of generics only when working with type parameters which are value types, otherwise the optimizer is pretty much helpless.

    And this doesn't even begin to address the things that parameters that aren't types enable for templates... many of which are huge perf wins.

    Of course, templates being fully instantiated at compile-time, don't provide any mechanism for extensibility.  Dynamic does (at a price of course).

  • Oh, and for the other matter:

    Hubs repeat all incoming traffic to all other ports.  Bridging hubs are a little better, they work on a packet level and can queue up traffic when there's a collision instead of discarding it (needed for say translating between segments with different speeds).

    Switches look at the layer 2 (ethernet, typically) destination address to decide which port to forward a packet through.  Usually the list of addresses reachable via each port is learned automatically by observing traffic.  Packets where the connectivity of the destination isn't known are flooded to all ports, like the hub.

    Routers look at the layer 3 (IP, typically) destination address to decide which port to forward a packet through.  The list of addresses (grouped into binary blocks) reachable via each port is managed either through manual configuration or exchange with peer routers.  Packets where the connectivity of the destination isn't known are dropped.

  • Hi Eric.

    Great Post (as usual).

    I'd really appreciate it if you could do a post on this (if you haven’t already).

    ---------------------------------------------------------------------------------------

    using System;

    namespace OhDear
    {
       class Program
       {
           static void Main()
           {
               Do(() => { }, question => { });
               Do(() => { throw new Exception("test"); }, question => { });
               Do(() => { }, (Exception question) => { });
               Do(() => { throw new Exception("test"); }, (Exception question) => { });
           }

           static void Do(Action action, Action<Exception> errorHandler)
           {
               Console.WriteLine("ONE");
           }

           static void Do<T>(Func<T> action, Action<T> callBack)
           {
               Console.WriteLine("TWO");
           }
       }
    }

    Expected output:

    ONE
    ONE
    ONE
    ONE

    Actual output:

    ONE
    ONE
    ONE
    TWO

  • "This illustrates that generics in C# are not like templates in C++."

    So was the choice to give them a syntax very much like templates in C++ done out of a desire to deliberately confuse programmers?

    Something I've picked up in my years as a developer is to not create something, e.g. an API, that looks and feels a lot like an existing something else which is similar (call it "A"), but whose use cases and/or implementation is actually different in subtle and confusing ways. Either make it so the implementation is not different from "A" at all, or make it so that the implementation notices and squawks loudly if you try to use it as if it were an "A", or if none of those are possible make it so that it looks as different from "A" as you possibly can.

    Has anyone else found this?

  • "So was the choice to give them a syntax very much like templates in C++ done out of a desire to deliberately confuse programmers?"

    C# has other confusing areas. For instance, C# uses the C++ ~destructor syntax, but the behavior is (as we know) very different. This is very confusing for C++ programmers.

  • Ben Voight wrote: "An array type meets the constraints for IEnumerable.  If I write a foreach loop against a type parameter (constrained to IEnumerable), the behavior is very different.  With templates, I get efficient array iteration with no bounds checking.  With generics, I get virtual calls to MoveNext and Current.  If the template wants to do something special for arrays and std::vector (because it knows the data are stored contiguously) it can.  Generics can't."

    You are 100% correct. Your example brings in other differences between C++ and C# (e.g. how arrays are handled, differences between value and reference).  Also the fact that std::vector is itself a template, and C++ can do "wonderful" things when templates are combined.

    On the other hand, if you used an example where you have (pseudocode)

    class Base { virtual void f(); }

    class Leaf : Base { virtual void f(); }

    And create a template  <Leaf &> or generic <Leaf>, then the effects of calling "f()" will be identical between the two (both will be virtual calls)

    The main point I was trying to illustrate is that NEITHER C++ templates nor C# generics make any "decisions" at execution time of a method.

  • Steven said:"C# has other confusing areas. For instance, C# uses the C++ ~destructor syntax, but the behavior is (as we know) very different. This is very confusing for C++ programmers."

    Why not: "C++ has other confusing areas. For instance, C++ uses the C# ~destructor syntax, but the behavior is (as we know) very different. This is very confusing for C# programmers."

    Over the past 37 years, I have programmed in dozens of "high level" languages, not to mention close to 50 different assembly languages. I really hate to think how convoluted things would be if each environment avoided syntactical constructs (e.g. assembly language mnemonics) that had previously been used every time there was a difference in behaviour...
