Too much type information, or welcome back System.Object and boxing

We all know that generics are good: they promote code reuse and static type checking by the compiler, increase runtime performance, allow more flexible OOP designs, lay the foundation for LINQ, help the IDE provide more helpful IntelliSense, and have tons and tons of other vital advantages. "var" is another good feature which, unlike "object", also helps to preserve full static type information.

However, I recently hit a rare case where I had too much static type information about my code, so I had to use System.Object (and boxing) to get the desired effect. I had a method that used reflection to set a property on an object, similar to this:

static void SetProperty(object f)
{
    Type type = f.GetType();
    PropertyInfo property = type.GetProperty("Bar");
    property.SetValue(f, 1, new object[0]);
}

I also had a struct like this:

struct Foo
{
    public int Bar { get; set; }
}

Now, I tried to set the Bar property on an instance of the struct:

static void Main(string[] args)
{
    var f = new Foo();
    SetProperty(f);
    Foo foo = (Foo)f;
    Console.WriteLine(foo.Bar);
}

It didn't work! It printed out 0! I was puzzled. And then I realized what was happening. Since Foo is a struct, and f (thanks to var!) is statically known to be a Foo, passing it to SetProperty(object) makes the compiler box a copy of the struct at the call site. SetProperty modifies that temporary boxed copy, which is then thrown away; the original f is not touched.
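To make the hidden step visible, here is a minimal sketch that spells out by hand what passing a Foo to a method taking object amounts to: the value is copied into a box on the heap, and only that box is modified. The names BoxingDemo, Run and temp are made up for this illustration; SetProperty and Foo mirror the code above.

```csharp
using System;
using System.Reflection;

struct Foo
{
    public int Bar { get; set; }
}

static class BoxingDemo
{
    // Same reflection-based setter as in the post.
    static void SetProperty(object f)
    {
        PropertyInfo property = f.GetType().GetProperty("Bar");
        property.SetValue(f, 1, null);
    }

    // Hypothetical helper: returns { value seen through the box, value left in f }.
    public static int[] Run()
    {
        var f = new Foo();

        // The implicit conversion at the call site, written out explicitly:
        // a copy of f is boxed, and SetProperty only ever sees that copy.
        object temp = f;
        SetProperty(temp);

        return new[] { ((Foo)temp).Bar, f.Bar };
    }

    static void Main()
    {
        int[] result = Run();
        Console.WriteLine(result[0]); // 1: the boxed copy was modified
        Console.WriteLine(result[1]); // 0: the original variable was not
    }
}
```

In the failing Main there is no variable like temp to read the result back from, so the modified box is simply lost.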

One simple change and it started working fine:

static void Main(string[] args)
{
    object f = new Foo();
    SetProperty(f);
    Foo foo = (Foo)f;
    Console.WriteLine(foo.Bar);
}

I changed var to object, so the struct was boxed into an object on the heap once, at the declaration. The reference to this same object was passed to the SetProperty method, the method set the property on the boxed instance, and (Foo) unboxed the same modified instance. The code now prints out 1 and everything is OK again.

"var" provided too much type information to the compiler: it knew that the variable is a struct and kept it unboxed, so the only box was a temporary copy made for the call, and I lost the modified value. After declaring the variable as object, we hid the extra information from the compiler and got uniform behavior for both value types and reference types.

In my original code where I encountered this peculiar behavior (a custom deserializer that reads XML and uses reflection to set properties on objects), I was too focused on working with all types, so I forgot that those can be value types as well. Since I had everything strongly typed with generics, type inference, vars and other modern goodness, the kind hardworking compiler preserved all the information for me and kept my values as unboxed structs where I was expecting to get reference type behavior. Thankfully, unit tests revealed the error 10 minutes after it was introduced (I definitely need to post about the usefulness of unit tests and TDD in the future), so it was a quick fix to box a value into object before filling its properties.
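As a rough sketch of that fix (not the actual deserializer), a generic method can box the new instance once, set every property on the box, and unbox only at the end. Everything here is an assumption made for illustration: the names DeserializerSketch, Populate, Point and Person are invented, and a dictionary of property names and values stands in for the data read from XML.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

struct Point
{
    public int X { get; set; }
    public int Y { get; set; }
}

class Person
{
    public string Name { get; set; }
}

static class DeserializerSketch
{
    // Hypothetical helper: 'values' stands in for data parsed from XML.
    public static T Populate<T>(IDictionary<string, object> values) where T : new()
    {
        // Box once, up front. For a class this is an ordinary reference
        // conversion; for a struct it creates the single heap copy that
        // every SetValue call below will modify.
        object boxed = new T();

        foreach (KeyValuePair<string, object> pair in values)
        {
            PropertyInfo property = typeof(T).GetProperty(pair.Key);
            if (property != null && property.CanWrite)
            {
                property.SetValue(boxed, pair.Value, null);
            }
        }

        // Unbox (or cast back) only after all properties are set.
        return (T)boxed;
    }

    static void Main()
    {
        Point p = Populate<Point>(new Dictionary<string, object> { { "X", 3 }, { "Y", 4 } });
        Person who = Populate<Person>(new Dictionary<string, object> { { "Name", "Ada" } });
        Console.WriteLine(p.X + "," + p.Y); // 3,4
        Console.WriteLine(who.Name);        // Ada
    }
}
```

The point of the design is that the generic type parameter never reaches the reflection calls; they only ever see the object reference, so structs and classes take the same path.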

It was an amusing experience.

  • Hi,

    I saw this post on VS2008's start page...

    Thought this might serve as another good example of why value types should be made immutable. The compiler did nothing wrong; it just faithfully inferred the type. The code wouldn't have worked if 'var' were replaced with its actual type anyway, and we all know that, right?

    Maybe a better solution would be getting rid of mutable value types in the design, if possible. That's much less error-prone.

  • Yes, mutable structs are evil. However, the requirements for our object deserializer are such that it has to be universal and be able to deserialize both classes and structs.

    To deserialize an immutable struct, we would need to know about a constructor and actually create the value, not just set properties.

  • You wrote "We all know that generics are good...".  I started with BASIC, worked my way to C, C++, and Java, then VB (I know, a really oblique turn), before I got to C#.  While I agree in part that generics are good, I don't believe var is a good thing at all.  It's an evil little thing that says I don't know what I'm working with, so I'll blindly go forward.  The C family of languages is strongly typed to avoid such dangerous ideas.  Since everything in C# is derived from the Object class, even value types, there should be no reason to use var outside of being assigned an anonymous type, which can only be used safely in the function it is declared in.

  • Why were you using a struct in the first place?

  • gps: I don't quite agree about the var part. Var preserves the static type of an expression without explicitly repeating it, thus avoiding redundancy. Most functional languages use it in one form or another ("type inference"). There are trade-offs, yes (mostly around readability), but I don't see var being technically harmful. I use it in about 50% of cases and use personal judgement every time. I recommend reading Jon Skeet's "C# in Depth" if you want to learn more about this.

    Matthew: We have structs in code which we don't want to rewrite now. However, they were being serialized manually and I was converting that to automatic serialization. That's why I needed the serializer to work with existing code.

  • var/type inference has its place in C#, e.g. LINQ, but that doesn't mean we should abuse it.

  • If you know it's a struct, and you designed it as a struct, you know you have to pass the parameters as ByRef (VB) / ref (C#). There's no mistake there. Three letters make the difference and you won't have any problem, even if you pass an object.

  • And that's exactly why I hate mutable structs so much.  See my post on enforcing immutability at http://blogs.msdn.com/kevinpilchbisson/archive/2007/11/20/enforcing-immutability-in-code.aspx

  • @Luis,

    Passing the argument to SetProperty by reference is not appropriate, because it implies that after the method call the reference passed in could be pointing to a completely different instance! That is clearly not what the method is trying to accomplish.

  • @ Luis,

    The whole point is that you don't know in advance whether you have a struct or an object... Kirill has stated that this example has been pulled from an automatic serializer that he is working on... in that case, you need to be able to pass it various objects without actually knowing what they are.

  • Another option would be to force structs to be passed by reference by using generics:

    static void SetProperty<T>(T f) where T : class
    {
        Type type = f.GetType();
        PropertyInfo property = type.GetProperty("Bar");
        property.SetValue(f, 1, new object[0]);
    }

    static void SetProperty<T>(ref T f) where T : struct
    {
        object fObj = (object)f;
        SetProperty(fObj);
        f = (T)fObj;
    }

    This will produce a compile error if you call SetProperty with a value type without specifying it as a ref parameter. Obviously, this won't work if the calls should be the same for object and value types:

    static void Main(string[] args)
    {
        var f1 = new Foo();
        SetProperty(ref f1);
        Foo foo1 = (Foo)f1;
        Console.WriteLine(foo1.Bar);

        object f2 = new Foo();
        SetProperty(f2);
        Foo foo2 = (Foo)f2;
        Console.WriteLine(foo2.Bar);
    }

  • @Kirill

    Certainly var has its place as a 'type' name for variables holding the return values of LINQ queries that will be anonymous types generated by the compiler, but in situations where the programmer knows the type in advance, it seems like it invites issues like this. I would personally recommend coding standards that explicitly forbid its use in those situations for that reason.

  • Well, as I said, I've never hit this case before, so I thought var could do no harm. I used to think carefully every time I needed to declare a local variable, and now I will think even more carefully. But I still love var and I expect to keep using it in the future where appropriate (I'll just have to be more careful). Not only for anonymous types (which I almost never use), but also where it increases readability and the type is clear from the variable name or context.

  • "To deserialize an immutable struct, we would need to know about a constructor and actually create the value, not just set properties."

    I don't understand this. If you are using reflection, can you not still deserialize a struct by setting the fields rather than the properties? After all, the framework somehow knows how to deserialize "immutable" value types... Since inheritance is not an issue, you know all the fields in a value type will be DeclaredOnly, and there will always be a parameterless default constructor suitable for Activator.CreateInstance().

    At the very least, if you want to use properties and make it immutable by normal means, just use a "private set" and then let reflection find that.

  • Hi Bruce,

    all your comments are very valid. From a couple of hints I see that you clearly know what you're talking about. However we have a couple of requirements:

    1. We want to keep our serializer/deserializer very simple, maintainable (500 lines of code for deserializer and 150 lines of code for serializer) and keep full control over it

    2. We only serialize public writable instance properties, we don't even look at fields

    3. The list of participating properties is returned by a piece of common reflection logic that we want to keep really simple/trivial

    4. This is not shipping code, so we just want to get it working and move on - my solution turned out to be the best in terms of cost/benefit - quick, maintainable and does the job.
