
The subtleties of Boxing

A lot of people struggle to understand what's up with boxing in .NET. It's really pretty simple. When you assign a value type to a reference-type variable (such as Object), the value type's data is copied onto the heap and *boxed* into an object. That makes it easy to use value types through references, but it also duplicates the data, so you have to be careful about which copy you are operating on.
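The copy made by boxing can be seen in a few lines. This is a minimal C# sketch, not code from the original post; the class and method names are made up for illustration.

```csharp
using System;

static class BoxingCopyDemo
{
    // Returns the value held in the box after the original variable changes.
    public static int BoxedValueAfterChange()
    {
        int original = 1;
        object boxed = original;   // boxing: the value 1 is copied into a heap object
        original = 2;              // changes only the local variable, not the box
        return (int)boxed;         // unboxing: copies the value back out of the box
    }

    static void Main()
    {
        Console.WriteLine(BoxedValueAfterChange()); // prints 1
    }
}
```

The box keeps the value it was given at the moment of boxing, which is why later changes to the original are not visible through the reference.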

Here's a simple example of how boxing can bite you if you don't see it happening. This code is written in VB.NET, but the same thing happens in C# (although in C# you need to add a method to the struct to decrement the balance). The code creates three bank accounts, charges each one a fee, and then checks whether the fee stuck. You'll see that the fee is lost in this demo. The explanation is below the code:

Imports System.Collections

Module Module1
    Sub Main()
        Dim al As New ArrayList

        Dim a As Account
        a.Balance = 300
        Dim a2 As Account
        a2.Balance = 400
        Dim a3 As Account
        a3.Balance = 500

        '*** each Add boxes the account and stores the boxed copy
        al.Add(a)
        al.Add(a2)
        al.Add(a3)

        For Each ai2 As Account In al
            ai2.Balance -= 5
            '*** show that the fee was applied (to the unboxed copy)
            System.Console.WriteLine(ai2.Balance)
        Next

        '*** changes are lost (or so it seems)
        For Each ai As Account In al
            System.Console.WriteLine(ai.Balance)
        Next
    End Sub
End Module

Structure Account
    Public Balance As Integer
End Structure

OK, for starters, the example is a bit contrived, I know. But it does serve to make the point nicely. An ArrayList is used to hold the accounts. The type of an ArrayList member is Object, which is a reference type. When we add the accounts to the ArrayList, the accounts (value types) are boxed, and the boxed copy is what actually gets added to the collection, not the original.

But wait. There's more. In the For Each loop, we assign the boxed account (now playing as a reference type) to a value type variable, so the data is unboxed into a third location (a local value type variable on the stack). The fee is charged against that third copy, and when the loop advances to the next element, the third copy is abandoned. Use the Locals window to see the fun in action.

Finally, we iterate over the ArrayList (which holds the second copy of the data), and you see the unchanged values.
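One way to make the fee stick while still using an ArrayList is to unbox each element, change the copy, and store the copy back by index. The following is a rough C# sketch of that idea under my own assumptions; WriteBackDemo and ChargeFee are hypothetical names, and the Account struct is redeclared so the block stands alone.

```csharp
using System;
using System.Collections;

struct Account
{
    public int Balance;
}

static class WriteBackDemo
{
    // Charges a fee by unboxing each element, changing the copy, and storing it back.
    public static void ChargeFee(ArrayList accounts, int fee)
    {
        // A for loop is used because replacing elements during a foreach
        // over an ArrayList invalidates the enumerator.
        for (int i = 0; i < accounts.Count; i++)
        {
            Account copy = (Account)accounts[i]; // unbox: copies the data out of the box
            copy.Balance -= fee;                 // modifies only the local copy
            accounts[i] = copy;                  // boxes the changed copy and replaces the old box
        }
    }

    static void Main()
    {
        ArrayList accounts = new ArrayList();
        Account a = new Account();
        a.Balance = 300;
        accounts.Add(a);

        ChargeFee(accounts, 5);
        Console.WriteLine(((Account)accounts[0]).Balance); // prints 295
    }
}
```

Each pass still boxes and unboxes, so this fixes the lost update without removing the copying.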

Is this a serious issue? Well, if you're a bank, and you charged all your customers a fee, and the fee disappeared, it's an issue! This is not a flaw in .NET, but rather a flaw in the program. Most of the time, boxing is a help, not a hindrance. But be aware of it, because data loss is a poor feature ;-)

How do you avoid this?

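1) Use Generics when they come out! A strongly typed collection stores the value types directly, so nothing is boxed. Keep in mind that a For Each variable is still a copy of the element, so a change made to it has to be written back to the collection.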

2) Don't change data in a value type via a reference variable, or vice versa. For value types, assignment = copy.

3) Take a few minutes, walk through this, figure out how boxing is affecting your data, and be more savvy in your assignments.
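As a sketch of the first two points, here is how the fee might be charged with a generic List<Account> in C#. It assumes the generic collections in System.Collections.Generic; GenericListDemo and ChargeAndRead are hypothetical names. No boxing occurs, but reading an element still yields a copy, so the changed copy is stored back by index.

```csharp
using System;
using System.Collections.Generic;

struct Account
{
    public int Balance;
}

static class GenericListDemo
{
    // Adds one account, charges the fee, and returns the balance stored in the list.
    public static int ChargeAndRead(int startingBalance, int fee)
    {
        List<Account> accounts = new List<Account>();
        Account a = new Account();
        a.Balance = startingBalance;
        accounts.Add(a);                // no boxing, but the list holds its own copy of 'a'

        for (int i = 0; i < accounts.Count; i++)
        {
            Account copy = accounts[i]; // the indexer returns a copy (assignment = copy)
            copy.Balance -= fee;
            accounts[i] = copy;         // store the changed copy back into the list
        }

        return accounts[0].Balance;
    }

    static void Main()
    {
        Console.WriteLine(ChargeAndRead(300, 5)); // prints 295
    }
}
```

If the accounts really need to be modified in place through references, declaring Account as a class instead of a struct is another option, at the cost of a heap allocation per account.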

If you ever wonder, *is my data being boxed?*, there's a simple way to tell. Look for the box instruction in your IL. Remember, the truth is in the IL. You'll also see the unbox instruction in the IL.
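To try this out, a tiny program like the C# sketch below can be compiled and opened in an IL disassembler such as ildasm. The names are hypothetical, and the comments describe what I would expect to find in the IL; the exact instruction sequence can vary by compiler version.

```csharp
using System;

static class IlBoxDemo
{
    // Boxes a value and unboxes it again so both instructions show up in the IL.
    public static int RoundTrip(int value)
    {
        object boxed = value;      // value type to object: look for a "box" instruction here
        int unboxed = (int)boxed;  // object to value type: look for "unbox" or "unbox.any" here
        return unboxed;
    }

    static void Main()
    {
        Console.WriteLine(RoundTrip(42)); // prints 42
    }
}
```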

Not really any new info, but it's a fun example to show, and describes an oft-misunderstood artifact of the .NET Framework.

Published Friday, February 18, 2005 12:21 AM by dougturn

Comments

# re: The subtleties of Boxing

Interesting... the C# compiler won't let you do this. You get the error "The left-hand side of an assignment must be a variable, property or indexer"
Friday, February 18, 2005 4:24 AM by Mike Perrin

# That's true

This is one of the places where the C# compiler is *tighter* than the VB.NET compiler.

Instead of touching the field directly on the line where you're getting the compiler error, add a ChargeFee method to the BankAccount struct, and change the code to call the method instead of touching the field. Like this:

using System;
using System.Collections;

class BoxingClass
{
    public static void Main()
    {
        BankAccount ba1 = new BankAccount();
        BankAccount ba2 = new BankAccount();
        BankAccount ba3 = new BankAccount();

        ba1.balance = 100;
        ba2.balance = 200;
        ba3.balance = 300;

        // each Add boxes the struct and stores the boxed copy
        ArrayList al = new ArrayList();
        al.Add(ba1);
        al.Add(ba2);
        al.Add(ba3);

        // charge all accounts a $5 fee (applied to the unboxed copy only)
        foreach (BankAccount ba in al)
        {
            ba.ChargeFee(5);
        }

        // see if it was correctly charged: prints 100, 200, 300
        foreach (BankAccount ba in al)
        {
            Console.WriteLine(ba.balance);
        }
    }
}

struct BankAccount
{
    public int balance;

    public void ChargeFee(int fee)
    {
        balance -= fee;
    }
}
Saturday, February 19, 2005 8:11 AM by Doug Turnure