
D Melodic Minor Accidental

A conglomeration of random thoughts on C#, .NET, OOP, and working at Microsoft

Referring to "value types" versus "reference types", Part 2

Last Friday, I blogged about confusion between the terms "value" and "reference", when relating to .NET development.  The post can be found here:

http://blogs.msdn.com/theoy/archive/2005/06/24/ValueRef.aspx

This time, I'm continuing the discussion specifically relating to value types and reference types.  After this, passing by value / passing by reference will be covered, as well as referring to an object's "value" or its "reference". So, continuing... I left off stating that I still have the following topics to cover, as part of Referring to "value types" versus "reference types":

   (1) Enums
   (2) Type unification, boxing/unboxing
   (3) Interfaces and inheritance for value types

Enums

.NET has built-in support for a special kind of type, called an enum type.  Enums are also value types, and have all of the traits that I described for value types in the previous blog entry, except for some small differences that are not relevant to this topic.  One way to look at them is as specialized value types.  For example, .NET enum types can't declare instance fields, methods or properties of their own.  Basically they exist so that one can specify names for what would otherwise be bare numeric values.  Their syntax is as follows:

    C#/J#:
        enum MyEnumType {
        }

    VB.NET (requires all enum types define at least one value):
        Enum MyEnumType
            OneSampleValue
        End Enum

Each of the languages will automatically assign values to the enum names if you don't specify them, but will also let you assign specific values to individual names if you wish.  As you'd expect, they all compile down to the same thing under the hood, which is one of the reasons why .NET language integration is so streamlined between C#/VB/J#.  However, when contrasting value types and reference types, a user only needs to know that enums are in the "value types" camp.
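
As a small C# sketch of the automatic and explicit value assignment described above (the enum types Color and HttpStatus are hypothetical names made up for this illustration):

```csharp
using System;

// Hypothetical enum types, used only for illustration.
enum Color { Red, Green, Blue }                 // values assigned automatically: 0, 1, 2
enum HttpStatus { Ok = 200, NotFound = 404 }    // values assigned explicitly

public static class EnumValuesDemo
{
    public static void Main()
    {
        Console.WriteLine((int)Color.Green);                       // 1
        Console.WriteLine((int)HttpStatus.NotFound);               // 404
        Console.WriteLine(typeof(Color).IsValueType);              // True
        Console.WriteLine(Enum.GetUnderlyingType(typeof(Color)));  // System.Int32
    }
}
```

The last two lines just confirm that an enum sits in the "value types" camp and is backed by a numeric type (int, unless you say otherwise).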

Type Unification, Boxing and Unboxing

This is when it gets a little hairy... so in the first article, I described value types getting allocated on the stack, and reference types on the heap.  However, there are some leftover questions that you're probably pondering, such as:

(1) If anything that doesn't extend from System.ValueType is a reference type, then how does the CLR tell the difference between value types and the rest of the types that descend directly/indirectly from System.Object?
(2) How do value types implement interfaces, and how are those interface implementations distinguished from those of reference types?
(3) Is System.ValueType itself a value type or a reference type?

Those are all good questions, and require understanding of the CLR type unification system.  Java has less of this to worry over, since it doesn't support custom value types (it only supports its built-in "primitive types").  In fact, prior to Java 1.5/5.0, users had to explicitly convert their primitive types to their corresponding wrapper objects (e.g. java.lang.Integer).  In addition, users of reflection on Java would have to deal with two different class objects for each primitive and its corresponding wrapped type.  Not fun.

Thus for .NET, there came a need to unify this more - and they introduced the concept of boxing and unboxing.  You can think of boxing and unboxing as another special form of copying for a value type (in fact, one must realize that boxing and unboxing both perform copies).  However, what distinguishes boxing and unboxing from other forms of copying (say assignment, parameter passing or receiving a return value) is that boxing places a value type inside a wrapper (a "box") where it *then* behaves like a heap object.  Once value types are wrapped as heap objects, they can be treated the same way as other heap objects.

.NET has built-in support for object-oriented type systems - and likewise has to implement virtual methods, similar to pre-.NET C++ and Java.  You probably guessed that virtual methods require some sort of special handling that makes them different from non-virtual methods.  Since any subclass can override a non-sealed virtual method, there needs to be some infrastructure to look up which function to actually run.
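
To make the copying behaviour of boxing and unboxing concrete, here is a minimal C# sketch.  The struct Counter is a hypothetical value type invented for the example; the point is only that each of the three variables ends up holding its own independent copy.

```csharp
using System;

// Hypothetical value type, used only for illustration.
struct Counter
{
    public int Count;
}

public static class BoxingCopyDemo
{
    // Returns the counts held by { original, boxed copy, unboxed copy }.
    public static int[] Run()
    {
        Counter original = new Counter();
        original.Count = 1;

        object boxed = original;            // boxing: copies the value into a heap object
        original.Count = 2;                 // changes only the original, not the box

        Counter unboxed = (Counter)boxed;   // unboxing: copies the value back out of the box
        unboxed.Count = 3;                  // changes only this new copy

        return new int[] { original.Count, ((Counter)boxed).Count, unboxed.Count };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));   // 2, 1, 3
    }
}
```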

The way that those usually work is that an object in memory has a pointer (just an address value to some other chunk of memory) that points back to a shared virtual method table (or v-table), since in .NET and Java, overriding a method overrides it for all instances of that type.  Then the object can be passed around as any super type that it implements, and when a function call is required, a lookup of the appropriate index in the v-table will allow the code to pick up the overridden method.  As long as the means to access the v-table pointer are consistent, and the indices are consistent, a function written for one type need not worry about whether it is executing on an object that's actually an instance of a subclass.
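
The visible effect of that lookup can be sketched in C# as follows.  Shape and Circle are hypothetical classes for the example, and the v-table itself is a runtime implementation detail that the code never sees; all the sketch shows is that a caller written against the base type still reaches the override.

```csharp
using System;

// Hypothetical class hierarchy, used only for illustration.
class Shape
{
    public virtual string Describe() { return "shape"; }
}

class Circle : Shape
{
    public override string Describe() { return "circle"; }
}

public static class VirtualDispatchDemo
{
    // Written against Shape only; knows nothing about subclasses.
    static string DescribeAny(Shape s)
    {
        return s.Describe();   // resolved at run time to the actual type's override
    }

    public static string DescribeCircle() { return DescribeAny(new Circle()); }
    public static string DescribeShape() { return DescribeAny(new Shape()); }

    public static void Main()
    {
        Console.WriteLine(DescribeShape());    // shape
        Console.WriteLine(DescribeCircle());   // circle
    }
}
```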

This is all fine and dandy, but once you're passing around pointers to objects, and you're trying to avoid examining what type they really are, then how do you know if you're dealing with an implementation derived from a value type, or a reference type? The answer is, by default, you don't.  That's why value types need to be boxable into heap objects, so that code that consumes any interfaces that they implement can dispatch off of those objects without having to concern itself with whether that object is a value type or a reference type.  Thus, it makes things easier to realize that when dealing with objects at an interface type-level, you're always dealing with heap objects - either reference type objects, or boxed value type objects.
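
One way to observe that an interface-typed reference to a value type really is a separate boxed heap object is the following C# sketch (IResettable and Meter are hypothetical names made up for the example).  Calling through the interface changes the boxed copy and leaves the original variable alone.

```csharp
using System;

// Hypothetical interface and value type, used only for illustration.
interface IResettable
{
    void Reset();
}

struct Meter : IResettable
{
    public int Reading;
    public void Reset() { Reading = 0; }
}

public static class InterfaceBoxingDemo
{
    // Returns { reading of the original struct, reading of the boxed copy }.
    public static int[] Run()
    {
        Meter meter = new Meter();
        meter.Reading = 5;

        IResettable viaInterface = meter;   // converting to the interface type boxes a copy
        viaInterface.Reset();               // resets the boxed copy on the heap

        return new int[] { meter.Reading, ((Meter)viaInterface).Reading };
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run()));   // 5, 0
    }
}
```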

The reason why I discuss "unification" is that in .NET, this wrapped value type support is automatically generated for you whenever you define a new value type.  It would be a nightmare if each user had to define their own heap wrapper type for every new value type they defined.  That's why Java's explicit type wrappers wouldn't be a scalable solution if Java were to allow users to define their own primitives.  Also, when using reflection, boxed and unboxed value types share the same type object (e.g. the one you get from C#'s typeof operator).  This means that if you're dealing with boxed or unboxed objects using reflection, you can be confident that testing them against a type object will do what you'd expect (another issue with Java's having two separate class objects for wrapped and unwrapped primitives).
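
A quick C# sketch of the shared type object (the class and method names are just for this example): a boxed int reports exactly the same System.Type as typeof(int), so there is no separate wrapper type to test against.

```csharp
using System;

public static class TypeIdentityDemo
{
    // True when the boxed form of an int reports the same type object as typeof(int).
    public static bool BoxedIntHasIntType(int value)
    {
        object boxed = value;                    // boxing
        return boxed.GetType() == typeof(int);   // same type object for boxed and unboxed
    }

    public static void Main()
    {
        Console.WriteLine(BoxedIntHasIntType(42));   // True

        object boxed = 42;
        Console.WriteLine(boxed is int);             // True
        Console.WriteLine(boxed.GetType());          // System.Int32
    }
}
```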

The last tricky part is that the type, System.ValueType itself, is not a value type.  Users can only create objects of types that derive from System.ValueType, and using a value type as an instance of System.ValueType causes it to be boxed.  The reason why is that once you're dealing with a value type at the System.ValueType level, you have very little knowledge remaining about the value type (for example, you have no idea what fields it has).  All you have are virtual methods that you can call, which require boxing, as you'd expect.  The same thing applies to System.Enum - once you're dealing with an enum value at the System.Enum level, you're actually dealing with a boxed enum.
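
This can be checked with reflection.  The sketch below uses only types that ship with .NET (System.DayOfWeek stands in for an arbitrary enum); the class name is made up for the example.

```csharp
using System;

public static class ValueTypeItselfDemo
{
    public static void Main()
    {
        // The two base types themselves are reference types...
        Console.WriteLine(typeof(ValueType).IsValueType);   // False
        Console.WriteLine(typeof(Enum).IsValueType);        // False

        // ...while the types that derive from them are value types.
        Console.WriteLine(typeof(int).IsValueType);         // True
        Console.WriteLine(typeof(DayOfWeek).IsValueType);   // True

        // Treating a value as System.ValueType or System.Enum boxes it.
        ValueType v = 42;
        Enum e = DayOfWeek.Monday;
        Console.WriteLine(v.GetType());   // System.Int32
        Console.WriteLine(e.GetType());   // System.DayOfWeek
    }
}
```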

As you can tell, System.ValueType and System.Enum have to be special-cased in the runtime, as their descendants are all value types.  In my opinion, there's not really a great way to structure them in the inheritance tree, but here's how they're structured in reality: (go, go, ascii art)

   System.Object
      |
      +- System.ValueType
      |   |
      |   +- System.Enum
      |   |   |
      |   |   +- <all enum-types>
      |   |
      |   +- <all-non-enum-value-types>
      |
      +- <all-other-reference-types>
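
The same tree can be walked with reflection.  This is a small sketch, with a hypothetical helper named Chain, that follows Type.BaseType upward from a few built-in types:

```csharp
using System;

public static class InheritanceChainDemo
{
    // Builds "Type -> Base -> ... -> System.Object" by following BaseType.
    public static string Chain(Type type)
    {
        string result = type.FullName;
        for (Type b = type.BaseType; b != null; b = b.BaseType)
        {
            result += " -> " + b.FullName;
        }
        return result;
    }

    public static void Main()
    {
        Console.WriteLine(Chain(typeof(DayOfWeek)));
        // System.DayOfWeek -> System.Enum -> System.ValueType -> System.Object
        Console.WriteLine(Chain(typeof(int)));
        // System.Int32 -> System.ValueType -> System.Object
        Console.WriteLine(Chain(typeof(string)));
        // System.String -> System.Object
    }
}
```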

I use VS 2005 (Beta2 is pretty stable) when I code, and one of the things that I always change first is the fonts and colors preferences.  I set VS to colorize all enums and value types in navy, instead of teal, so that I can notice what my code is doing, and whether or not I'm going to copy, box, or unbox.  Following the same scheme, I've colorized the types System.ValueType and System.Enum in teal, because those two types specifically are reference types.  It's their derived types that are value types.

Interface inheritance for Value types

Users defining their own value types will notice that they can specify interfaces for their value types to implement.  Yes, this is a powerful corollary to the unification described previously.  When transferred to a context that requires an interface to be implemented, value types get boxed, and are then indistinguishable from reference types to the interface consumers.  What I mean by indistinguishable is that consumers needn't worry about whether they're dealing with value types or reference types - but they can of course query whether the object in question also derives from System.ValueType (e.g. in C#, using the "is"/"as" operators).
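
As a sketch of that query in C#: the method below (a hypothetical helper) accepts anything that implements the built-in IComparable interface, and uses "is" to find out whether the object behind the interface is a boxed value type.

```csharp
using System;

public static class InterfaceConsumerDemo
{
    // The consumer only sees IComparable, but can still ask about System.ValueType.
    public static bool IsBoxedValueType(IComparable item)
    {
        return item is ValueType;
    }

    public static void Main()
    {
        Console.WriteLine(IsBoxedValueType(42));       // True  (int is a value type, boxed here)
        Console.WriteLine(IsBoxedValueType("text"));   // False (string is a reference type)
    }
}
```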

I think that pretty well summarizes most of the important differences between value types and reference types.  In the queue of topics to cover in my future blog posts will be some best-practices for value types.

Published Tuesday, June 28, 2005 1:31 PM by TheoY

Comments

# re: Referring to "value types" versus "reference types", Part 2 @ Tuesday, June 28, 2005 9:25 PM

Very Good!!! Thanks.

DeepICE

# Clear up reference and value type confusions @ Tuesday, July 26, 2005 8:38 AM

Ran into a good post recently which pointed out exactly what I
mentioned in my last post about the System.ValueType...

.NET and other stuff

# re: Referring to "value types" versus "reference types", Part 2 @ Wednesday, July 27, 2005 6:34 AM

All this work to make it seamless. And now, in C# 2.0, they are making valuetypes nullable.

Seems like a patch on a patch to me.

But what I wonder is: What is the *real* value of valuetypes? I know there are some optimizations in there, but wouldn't we be better of if there were no value type?

Thomas Eyde
