September, 2008

  • Kirill Osenkov

    More comments on generics

    Here are some more unsorted thoughts on generics to continue this post (which has some interesting comments too).

    Preserving types with generic type inference

    To reiterate, instead of having a method like:

    void Process(object instance)
    {
    }

    one can write:

    void Process<T>(T instance)
    {
    }

    and thus preserve the static type information about the parameter. The method can still be called as Process(stuff); there is no need to spell out the generic type argument as in Process<StuffType>(stuff), because the compiler infers it from the argument.
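
    As a minimal sketch of the difference (InferenceDemo and its two methods are hypothetical names, used only for illustration), the generic version can report the static type that inference picked, while the object version only ever sees object at compile time:

    ```csharp
    using System;

    public static class InferenceDemo
    {
        // Non-generic version: the static type of the parameter is always object.
        public static string StaticTypeOfObject(object instance)
        {
            return typeof(object).Name;
        }

        // Generic version: T is inferred from the compile-time type of the argument.
        public static string StaticTypeOf<T>(T instance)
        {
            return typeof(T).Name;
        }
    }
    ```

    Calling InferenceDemo.StaticTypeOf(42) yields "Int32" and InferenceDemo.StaticTypeOf("text") yields "String", with no type argument written at the call site. Note that inference uses the static type of the argument: if a string is first stored in a variable declared as object, T is inferred as object.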

    Casting by example

    This technique is useful, among other cases, in one specific situation: when we can't write the generic type argument at all because it is an anonymous type. We can still call a generic method and "specify" the anonymous type argument through type inference (more precisely, the smart C# compiler does it for us). I blogged more about this here: http://kirillosenkov.blogspot.com/2008/01/how-to-create-generic-list-of-anonymous.html
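
    A minimal sketch of the idea, assuming a hypothetical helper called CreateListOf (the name and the Demo method are made up for this illustration):

    ```csharp
    using System.Collections.Generic;

    public static class ListFactory
    {
        // The example value is never used; it exists only so that T can be inferred.
        public static List<T> CreateListOf<T>(T example)
        {
            return new List<T>();
        }

        public static int Demo()
        {
            // T is an anonymous type, so it could not be written out explicitly.
            var people = CreateListOf(new { Name = "", Age = 0 });
            people.Add(new { Name = "Ada", Age = 36 });
            people.Add(new { Name = "Alan", Age = 41 });

            // The elements are still statically typed: Age is an int.
            return people[0].Age + people[1].Age; // 77
        }
    }
    ```

    This relies on the compiler reusing one anonymous type for all object initializers that have the same property names and types in the same order within an assembly.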

    Static data is not shared among constructed generic types

    Jacob Carpenter reminds us of this important (and useful!) fact: http://jacobcarpenter.wordpress.com/2008/07/21/c-reminder-of-the-day/
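
    A minimal sketch of what this means in practice (Counter<T> and StaticsDemo are hypothetical names chosen for this illustration):

    ```csharp
    public static class Counter<T>
    {
        // Every constructed type (Counter<int>, Counter<string>, ...) gets its own copy.
        public static int Count;
    }

    public static class StaticsDemo
    {
        public static string Run()
        {
            // Reset first so the result does not depend on earlier calls.
            Counter<int>.Count = 0;
            Counter<string>.Count = 0;

            Counter<int>.Count++;
            Counter<int>.Count++;
            Counter<string>.Count++;

            return Counter<int>.Count + "," + Counter<string>.Count; // "2,1"
        }
    }
    ```

    The useful side of this is that a static field in a generic class can act as a per-type cache without any dictionary lookup.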

    You can constrain a generic type parameter on a type to be its derived type

    Darren Clark mentions another nice trick with generics:

    public class MyBase<T> where T: MyBase<T>

    Here, we essentially limit T to be a type derived from MyBase<T>.
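
    One place where this pays off is a base-class method that returns the derived type. The sketch below is only an illustration; Shape<T>, Circle and their members are hypothetical and do not come from Darren's example:

    ```csharp
    public abstract class Shape<T> where T : Shape<T>
    {
        public string Name { get; private set; }

        // Returns T rather than Shape<T>, so chained calls keep the derived type.
        public T Named(string name)
        {
            Name = name;
            return (T)this;
        }
    }

    public sealed class Circle : Shape<Circle>
    {
        public double Radius { get; private set; }

        public Circle WithRadius(double radius)
        {
            Radius = radius;
            return this;
        }
    }
    ```

    With this in place, new Circle().Named("unit").WithRadius(1.0) compiles, because Named returns a Circle and not just a Shape<Circle>. The constraint does not fully prevent misuse (a class could still pass a sibling type as T, in which case the cast in Named would fail at runtime), so it is a convention more than a guarantee.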

    Limits of type inference

    The C# 3.0 compiler is much smarter than the 2.0 compiler when it comes to type inference. Jon Skeet described this really well in his book C# in Depth. However, there is still room for improvement: the type inference engine could be made even smarter and deduce types from the generic constraints. I hit this limitation recently when I tried to implement the topological sorting algorithm as a mix-in (using extension methods and generic type inference). The following program compiles fine; however, if you remove the generic arguments in the call to TopologicalSort in Main(), it fails to compile:

    using System.Collections.Generic;
    
    // Generalized algorithms and data-structures - I want to reuse them by "mixing-in"
    interface IDependencyNode<TNode, TNodeList>
        where TNode : IDependencyNode<TNode, TNodeList>
        where TNodeList : IDependencyList<TNode>
    {
        // TNodeList Dependencies { get; } <-- that's why I need TNodeList
    }
    
    interface IDependencyList<TNode> : IEnumerable<TNode> { }
    
    static class Extensions
    {
        public static IEnumerable<TNode> TopologicalSort<TNode, TNodeList>(this TNodeList nodes)
            where TNode : IDependencyNode<TNode, TNodeList>
            where TNodeList : IDependencyList<TNode>
        {
            return null; // algorithm goes here
        }
    }
    
    // Mixing-in to my concrete world of Figures and FigureLists
    // I basically get the implementation of topological sort for free
    // without inheriting from any classes
    class Figure : IDependencyNode<Figure, FigureList> { }
    
    class FigureList : List<Figure>, IDependencyList<Figure> { }
    
    class Program
    {
        static void Main(string[] args)
        {
            FigureList list = new FigureList();
            // wouldn't it be sweet if we could infer the type arguments here??
            // list.TopologicalSort(); // doesn't compile
            list.TopologicalSort<Figure, FigureList>(); // compiles fine
        }
    }

    In this case, one could potentially figure out the types Figure and FigureList (at least, it is doable by a human :), but then the type inference algorithm would become even more complex than it is now (in fact, I suspect that it would become as powerful as a typical Prolog solver, because it would require unification). The C# compiler team certainly has higher-priority tasks now than implementing Prolog in the C# compiler.

    Finally, when I look at the code above, it is too complex, clumsy, and difficult to understand. One shouldn't pay such a high price for the flexibility of mix-ins. There is a much simpler and more elegant solution to the problem, which I hope to come back to later.
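
    In the meantime, here is one possible workaround, offered only as a sketch and not necessarily the simpler solution alluded to above. It assumes we can drop the TNodeList type parameter, so that the only type to infer appears directly in the parameter type. The Dependencies property, the DependsOn method and the depth-first implementation are assumptions made for this illustration:

    ```csharp
    using System.Collections.Generic;

    public interface IDependencyNode<TNode> where TNode : IDependencyNode<TNode>
    {
        IEnumerable<TNode> Dependencies { get; }
    }

    public static class DependencyExtensions
    {
        // TNode appears in the parameter type, so it can be inferred at the call site.
        public static List<TNode> TopologicalSort<TNode>(this IEnumerable<TNode> nodes)
            where TNode : IDependencyNode<TNode>
        {
            var sorted = new List<TNode>();
            var visited = new HashSet<TNode>();
            foreach (TNode node in nodes)
            {
                Visit(node, visited, sorted);
            }
            return sorted;
        }

        // Depth-first visit: dependencies are emitted before the node itself.
        // Assumes the graph is acyclic; with a cycle the order is not meaningful.
        private static void Visit<TNode>(TNode node, HashSet<TNode> visited, List<TNode> sorted)
            where TNode : IDependencyNode<TNode>
        {
            if (!visited.Add(node)) return;
            foreach (TNode dependency in node.Dependencies)
            {
                Visit(dependency, visited, sorted);
            }
            sorted.Add(node);
        }
    }

    public class Figure : IDependencyNode<Figure>
    {
        private readonly List<Figure> dependencies = new List<Figure>();
        public string Name { get; set; }
        public IEnumerable<Figure> Dependencies { get { return dependencies; } }
        public void DependsOn(Figure other) { dependencies.Add(other); }
    }
    ```

    With these definitions, calling TopologicalSort() on a List<Figure> compiles without explicit type arguments. The price is that the algorithm no longer knows about a dedicated list type, which is exactly the information the two-parameter version above was trying to keep.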

  • Kirill Osenkov

    Too much type information, or welcome back System.Object and boxing

    We all know that generics are good: they promote code reuse, enable static type checking by the compiler, increase runtime performance, allow more flexible OOP designs, lay the foundation for LINQ, help the IDE provide more helpful IntelliSense, and have tons of other vital advantages. "var" is another good feature which, unlike "object", also helps preserve full static type information.

    However, I recently hit a rare case where I had too much static type information about my code, so I had to use System.Object (and boxing) to get the desired effect. I had a method that used reflection to set a property on an object, similar to this:

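    // Needs: using System; using System.Reflection;
    static void SetProperty(object f)
    {
        Type type = f.GetType();
        PropertyInfo property = type.GetProperty("Bar");
        // The last argument holds index values; Bar is not an indexer, so it is empty.
        property.SetValue(f, 1, new object[0]);
    }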

    I also had a struct like this:

    struct Foo
    {
        public int Bar { get; set; }
    }

    Now, I tried to set the Bar property on an instance of the struct:

    static void Main(string[] args)
    {
        var f = new Foo();
        SetProperty(f);
        Foo foo = (Foo)f;
        Console.WriteLine(foo.Bar);
    }

    It didn't work! It printed out 0! I was puzzled. And then I realized what was happening. Since Foo is a struct, and f (thanks to var!) is statically known to be a Foo, the call to SetProperty passes a copy of the struct (boxed into a temporary object, because the parameter is of type object). This copy is modified, but the original f is not.

    One simple change and it started working fine:

    static void Main(string[] args)
    {
        object f = new Foo();
        SetProperty(f);
        Foo foo = (Foo)f;
        Console.WriteLine(foo.Bar);
    }

    I changed var to object, so the struct was boxed into an object on the heap, a reference to that same object was passed to the SetProperty method, the method set the property on the boxed instance, and (Foo) unboxed the same modified instance. The code now prints out 1 and everything is OK again.

    "var" provided too much type information to the compiler: it knew that the variable is a struct and kept it unboxed, so the modification landed on a temporary boxed copy and I lost the modified value. By declaring the variable as object, we hid the extra information from the compiler and got uniform behavior for both value types and reference types.

    In my original code where I encountered this peculiar behavior (a custom deserializer that reads XML and uses reflection to set properties on objects), I was too focused on working with all types so I forgot that those can be value types as well. Since I had everything strongly typed with generics, type inference, vars and other modern goodness, the kind hardworking compiler preserved all the information for me and avoided boxing where I was expecting to get reference type behavior. Thankfully, unit-tests revealed the error 10 minutes after it was introduced (I definitely need to post about the usefulness of unit-tests and TDD in the future), so it was a quick fix to box a type into object before filling its properties.
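
    The fix described above (box once, fill the property, unbox) can be sketched as a small generic helper. This is an illustration only: ReflectionHelper, SetPropertyBoxed and the ref parameter are hypothetical and are not the code of the original deserializer.

    ```csharp
    using System;
    using System.Reflection;

    public struct Foo
    {
        public int Bar { get; set; }
    }

    public static class ReflectionHelper
    {
        // Works for value types and reference types alike. Assumes target is not null
        // and that a public instance property with the given name exists.
        public static void SetPropertyBoxed<T>(ref T target, string propertyName, object value)
        {
            // Box once (for a reference type this just copies the reference).
            object boxed = target;

            PropertyInfo property = boxed.GetType().GetProperty(propertyName);
            property.SetValue(boxed, value, null);

            // Copy the modified value back into the caller's variable.
            target = (T)boxed;
        }
    }
    ```

    With this helper, a local declared as var f = new Foo() can stay strongly typed: after ReflectionHelper.SetPropertyBoxed(ref f, "Bar", 1), f.Bar is 1.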

    It was an amusing experience.

  • Kirill Osenkov

    Why is a comparison of a value type with null a warning?

    A reader (Petar Petrov) asked me a question recently which I didn't quite know how to answer:

    Why a comparison of a value type against null is a warning? I definitely think it should be a compiler error.

    So I asked the C# compiler team and here's the explanation (please welcome today's special guest Eric Lippert):

    Why is it legal?


    It's legal because the lifted comparison operator is applicable. If you are comparing an int to null then the comparison operator that takes two int?s is applicable.

    Why is it a warning?

    It's a warning because the comparison always results in "false".

    Why is that not an error?

    Let's turn that around -- why should it be an error?  Why should any comparison which the compiler knows the answer to be an error?  That is, if you think this should be an error, then why shouldn't
    if (123 == 456)
    be an error?  Or for that matter, why shouldn't
    if (false)
    be an error?

    Three reasons why none of these things should be errors:
    First, the argument that this is work for me. The spec is complicated enough already and the implementation is divergent from the spec enough already; let's not be adding even more special cases that I can then get wrong for you.

    Second, the argument from design. By design, C# is an "enough rope" programming language -- we do not try to constrain you to writing only meaningful programs. Rather, we let you write almost any program, and then give you warnings when it looks like you might be entangling yourself in the rope we gave you. If you don't like that, choose a language that gives you less flexibility.  (This is part of the impetus behind the push towards more declarative programming languages; declarative programming languages are less likely to contain senseless commands because they consist of descriptions of how things are desired to be.)

    Third, the argument about generated code. We do not disallow statements like if (12 == 13) { whatever... } because not all code is typed in by humans. Some of it is generated by machines, and machines often follow the same rigid rules generating code that compilers do consuming it. Do we really want to put the burden upon machine-generated-code providers that they must jump through the same constant-folding hoops that the C# compiler does in order to avoid compiler errors? Do we want to make machine-generated code not only have to be syntactically and grammatically correct C# code, but also _clever_ and _well-written_ C# code? I don't think we do; I think that makes the job of both the code producer and the compiler writer harder without any corresponding gain in safety or productivity.

    Eric
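
    To make the first point concrete, here is a minimal sketch (LiftedComparisonDemo and its methods are hypothetical names used only for illustration) of a comparison that is legal because of the lifted operator, next to one where null is a real possibility:

    ```csharp
    public static class LiftedComparisonDemo
    {
        public static bool CompareIntToNull(int value)
        {
            // Legal: value is implicitly converted to int?, so the lifted
            // ==(int?, int?) operator applies. The compiler reports warning
            // CS0472 because the result is always false.
            return value == null;
        }

        public static bool CompareNullableToNull(int? value)
        {
            // No warning here: an int? really can be null.
            return value == null;
        }
    }
    ```

    The first method compiles with a warning and returns false for every input; the second returns true only when it is given null.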
