To box or not to box, that is the question

Suppose you have an immutable value type that is also disposable. Perhaps it represents some sort of handle.

struct MyHandle : IDisposable
{
    public MyHandle(int handle) : this() { this.Handle = handle; }
    public int Handle { get; private set; }
    public void Dispose()
    {
        Somehow.Close(this.Handle);
    }
}

You might think hey, you know, I'll decrease my probability of closing the same handle twice by making the struct mutate itself on disposal!

public void Dispose()
{
    if (this.Handle != 0)
        Somehow.Close(this.Handle);
    this.Handle = 0;
}

This should already be raising red flags in your mind. We're mutating a value type, which we know is dangerous because value types are copied by value; you're mutating a variable, and different variables can hold copies of the same value. Mutating one variable does not mutate the others, any more than changing one variable that contains 12 changes every variable in the program that also contains 12. But let's go with it for now.
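To make the copy semantics concrete, here is a minimal sketch. The Counter struct and the CopyDemo class are hypothetical names invented for illustration; they are not part of the handle example.

```csharp
using System;

// Hypothetical mutable value type, used only to illustrate copying.
struct Counter
{
    public Counter(int value) : this() { Value = value; }
    public int Value { get; private set; }

    // Mutates whichever variable the method is called on.
    public void Reset() { Value = 0; }
}

static class CopyDemo
{
    public static (int original, int copy) Run()
    {
        var a = new Counter(12);
        var b = a;      // b holds an independent copy of a's value
        b.Reset();      // mutates the variable b only
        return (a.Value, b.Value);
    }
}
```

CopyDemo.Run() should return (12, 0): resetting b leaves a untouched, just as with two int variables that both happen to contain 12.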

What does this do?

var m1 = new MyHandle(123);
try
{
  // do something
}
finally
{
    m1.Dispose();
}
// Sanity check
Debug.Assert(m1.Handle == 0);

Everything work out there?

Yep, we're good. m1 begins its life with Handle set to 123, and after the dispose it is set to zero.

How about this?

var m2 = new MyHandle(123);
try
{
  // do something
}
finally
{
    ((IDisposable)m2).Dispose();
}
// Sanity check
Debug.Assert(m2.Handle == 0);

Does that do the same thing? Surely casting an object to an interface it implements does nothing untoward, right?

.

.

.

.

.

.

.

.

Wrong. This boxes m2. Boxing makes a copy, and it is the copy which is disposed, and therefore the copy which is mutated. m2.Handle stays set to 123.
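One way to see the copy is to hold on to the box instead of discarding it. The following is a sketch under assumptions: Dispose is reduced to the mutation alone (the call to Somehow.Close is left out), and BoxingDemo is a hypothetical name.

```csharp
using System;

struct MyHandle : IDisposable
{
    public MyHandle(int handle) : this() { Handle = handle; }
    public int Handle { get; private set; }

    public void Dispose()
    {
        // Somehow.Close is omitted here; only the mutation matters.
        Handle = 0;
    }
}

static class BoxingDemo
{
    public static (int variable, int box) Run()
    {
        var m2 = new MyHandle(123);
        IDisposable boxed = m2;   // boxing copies m2's value into a heap object
        boxed.Dispose();          // mutates the boxed copy in place
        return (m2.Handle, ((MyHandle)boxed).Handle);
    }
}
```

BoxingDemo.Run() should return (123, 0): the box was zeroed, and the variable m2 was not.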

So what does this do, and why?

var m3 = new MyHandle(123);
using(m3)
{
  // Do something
}
// Sanity check
Debug.Assert(m3.Handle == 0);

.

.

.

.

.

.

.

.

.

.

 

Based on the previous example you probably think that this boxes m3, mutates the box, and therefore the assertion fires, right?

Right?

Is that what you thought?

You'd be perfectly justified in thinking that there is a boxing performed in the finally because that's what the spec says. The spec says that the "using" statement's expansion when the expression is a non-nullable value type is

finally
{
  ((IDisposable)resource).Dispose();
}

However, I'm here today to tell you that the disposed resource is in fact not boxed in our implementation of C#. The compiler has an optimization: if it detects that the Dispose method is exposed directly on the value type then it effectively generates a call to

finally
{
  resource.Dispose();
}

without the cast, and therefore without boxing.

Now that you know that, would you like to change your answer? Does the assertion fire? Why or why not?

Give it some careful thought.

.

.

.

.

.

.

.

.

.

.

The assertion still fires, even though there is no boxing. The relevant line of the spec is not the one that says that there's a boxing cast; that's a red herring. The relevant bit of the spec is:

A using statement of the form "using (ResourceType resource = expression) statement" corresponds to one of three possible expansions. [...] A using statement of the form "using (expression) statement" has the same three possible expansions, but in this case ResourceType is implicitly the compile-time type of the expression, and the resource variable is inaccessible in, and invisible to, the embedded statement.

That is to say, our program fragment is equivalent to:

var m3 = new MyHandle(123);
using(MyHandle invisible = m3)
{
  // Do something
}
// Sanity check
Debug.Assert(m3.Handle == 0);

which is equivalent to

var m3 = new MyHandle(123);
{
  MyHandle invisible = m3;
  try
  {
    // Do something
  }
  finally
  {
    invisible.Dispose(); // No boxing, due to optimization
  }
}
// Sanity check
Debug.Assert(m3.Handle == 0);

It is the invisible copy which is disposed and mutated, not m3.

And that's why the compiler can get away with not boxing in the finally. The thing that it is not boxing is invisible and inaccessible and therefore there is no way to observe that the boxing was skipped.
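A practical consequence is that the self-zeroing trick does not protect m3 from a second close. The sketch below assumes a stand-in for Somehow.Close that merely records the handles it is given; that recorder and the UsingDemo class are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Somehow.Close: it only records each call.
static class Somehow
{
    public static readonly List<int> Closed = new List<int>();
    public static void Close(int handle) { Closed.Add(handle); }
}

struct MyHandle : IDisposable
{
    public MyHandle(int handle) : this() { Handle = handle; }
    public int Handle { get; private set; }

    public void Dispose()
    {
        if (Handle != 0)
            Somehow.Close(Handle);
        Handle = 0;
    }
}

static class UsingDemo
{
    public static List<int> Run()
    {
        Somehow.Closed.Clear();
        var m3 = new MyHandle(123);
        using (m3)
        {
            // do something
        }
        // m3 itself was never mutated, so its guard does not stop this call.
        m3.Dispose();
        return new List<int>(Somehow.Closed);
    }
}
```

UsingDemo.Run() should return a list containing 123 twice: once from the invisible copy disposed by the using statement, and once from the explicit call on m3.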

Once again the moral of the story is: mutable value types are enough pure evil to turn you all into hermit crabs, and therefore should be avoided.

  • foreach and the special structs on the generic collections Enumerators would like a word :)

    At least the expansion of that sugar is reasonable (as in doesn't surprise most people)

  • While totally agreeing with everything you've said, mutable value types could be an argument for adding support for C++ style copy constructors and out-of-scope destructors for C# value types.

  • I think all readers of your blog now agree that mutable value types are evil... But I wonder why this rule is so often not applied in the .NET Framework classes (DictionaryEntry, GCHandle, Point, Size...)

  • @Thomas Levesque: I don't think that "mutable value types are evil" was known in the .NET 1.0 timeframe, at least not as well as we know it now, and all of your examples are from 1.0. A case could be made for e.g. List<T>.Enumerator, though you're not normally supposed to directly access that type, so it's (plausibly) less of an issue.

  • This self-mutating struct is indeed very nasty, very evil. Let's never speak of it again.

  • GCHandle's mutability is hidden far away from the average user.

    Platform designers often do this in places where it appears evil but is well understood.

    I think Point is a legacy of trying to play along with POINT: msdn.microsoft.com/.../dd162805%28v=vs.85%29.aspx, likewise Size

    It's still quite nasty.

    DictionaryEntry is pure evil

  • @Jonathan Pryor, you're right of course... but WPF (.NET 3.0) also has its own Point and Size structures, and they're also mutable.

    @Shuggy, yes, I think that's because it's mapped directly to the unmanaged structure...

  • The WPF structs (like Point and Size) have to be mutable because XAML cannot create immutable objects (it can only call default constructors and set properties). The real question is why they are structs, seeing as how they have to be mutable. I'd guess that it's an optimization. You could potentially have millions of Point objects, and performance could really suffer if the memory footprint doubled and each one had to be individually allocated, initialized, and GCed.

  • Re: "Suppose you have an immutable value type that is also disposable".

    Suppose you have an invisible purple dragon.

    Disposing the value changes its state (though not necessarily the bits of the struct).  If its state can change, then it isn't immutable.

  • @Gabe @Thomas: XAML can create immutable objects just fine (it can create strings and other primitives, after all), but Xaml2006 has no way of explicitly declaring instances of immutable types with non-default constructors.  This renders immutable types with more than one member somewhat useless in XAML.  Your only option is to declare a default TypeConverter such that you could put a string representation in XAML and have it converted to an instance of an immutable type.  Then you could do this:

    <SomeObject Location="(0, 2)" />

    Here, for example, 'Location' is an immutable 'Point' with (x, y) members. Xaml2009 includes support for non-default constructors, but the syntax is relatively verbose, and Xaml2009 wasn't around in .NET 3.0, so we're stuck with some mutable value types.

  • "foreach" is different - per language spec, it does not expand to using but rather directly to try/finally, and the spec also explicitly says that, for enumerators which are value types, Dispose() is called directly without boxing. The difference is that foreach calls different methods several times on the enumerator, so you can write a perverted implementation that would be able to spot boxing on that final Dispose() call.

  • I cannot agree wholesale with the statement that "mutable value types are enough pure evil to turn you all into hermit crabs." The real problem is when we try to treat values as objects. This is a flaw in the "everything is an object" concept. When a struct implements an interface (especially if the interface implies mutability) you are treating a value like an object. Your example is analogous to the much simpler scenario of assigning the value of 'int i' to 'int j', incrementing the value of j, and expecting i to change as well. (I understand that an integer is "immutable," but I take the position that however j comes to be incremented, it is an implementation detail, whether the processor chooses to twiddle some bits [mutate] or wholly overwrite a value. The difference is philosophical; Schrodinger's cat is doing our math.) As long as structs are used only as "complex values," and are designed in this spirit, things tend to go okay. (I've written up these thoughts in a more elaborate and drawn-out manner at snarfblam.com/words) If you're implementing an interface with a struct, that's a clear sign you're doing something wrong.

  • @snarfblam

    It most certainly isn't. Implementing an interface that implies mutability is, but many value types perfectly reasonably implement things like IEquatable<T>, IFormattable, and IComparable<T>, to name but a few.

    Interfaces do *not* imply boxing (take a look at the constrained opcode if you want to know why) but this is an implementation detail anyway since immutability means it doesn't matter if you take a copy (in boxing) anyway.
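The point about constrained calls can be sketched with a generic method. This is an illustration under assumptions, not anything from the post: DisposeInPlace and ConstrainedDemo are hypothetical names, and Dispose is reduced to the mutation alone.

```csharp
using System;

struct MyHandle : IDisposable
{
    public MyHandle(int handle) : this() { Handle = handle; }
    public int Handle { get; private set; }

    public void Dispose()
    {
        // Somehow.Close is omitted here; only the mutation matters.
        Handle = 0;
    }
}

static class ConstrainedDemo
{
    // T is constrained to IDisposable, so no cast to the interface is needed.
    // Passing by ref makes the caller's variable itself the receiver.
    static void DisposeInPlace<T>(ref T resource) where T : IDisposable
    {
        resource.Dispose();
    }

    public static int Run()
    {
        var m = new MyHandle(123);
        DisposeInPlace(ref m);
        return m.Handle;
    }
}
```

ConstrainedDemo.Run() should return 0: the interface method ran against the variable m directly, with no boxed copy in between.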

  • @snarfblam

    actually I just read your blog and you have bigger problems.

    If you cannot differentiate between a variable and the value contained within the variable I think you need to have a serious think about your CS skills.

    compare:

    public class X
    {
       public readonly Point P1;
       public Point P2;
       public readonly int X1;
       public int X2;
    }

    and have a read of Eric's previous post: blogs.msdn.com/.../mutating-readonly-structs.aspx

    There are extreme tricks you can do: www.bluebytesoftware.com/.../WhenIsAReadonlyFieldNotReadonly.aspx but this is still not altering the value, it's altering the variable (by aliasing to the same location)

    Fundamentally immutable value types are different from mutable ones because a whole class of bugs simply cannot happen, especially in the circumstances where you treat one as an object.

    Your blog post conflates treating values like objects with the important distinction to remember, which is very simple. Value types have copy value semantics, reference types have "copy the reference value semantics". Eric, of course, explains it better: blogs.msdn.com/.../the-stack-is-an-implementation-detail.aspx

  • @Shuggy: I dunno, he seems to have a pretty OK grasp: he's saying that integer values being immutable doesn't mean integer variables are. Also, wouldn't you like a way to call unexposed interface members without boxing? By the way, I believe he was referring to treating values as objects philosophically, not actual boxing, re: opcodes.

    As a C++ dev, I also can't agree with the /idea/ of mutable value types as evil. I think them having *exactly the same syntax* is bad, but I can't think of a good alternative which keeps focus on reference types. But simply changing the syntax coloring of value types solves that issue. (I also change the interface and delegate colors in case someone is doing something silly in a library). Perhaps allowing explicit references to values would help? I don't know. The real issue, as always, is whether the programmer understands the language - something that language design can never really ensure (though it certainly has a huge effect!).
