Fabulous Adventures In Coding
Eric Lippert is a principal developer on the C# compiler team. Learn more about Eric.
I blogged a while back about how “references” are often described as “addresses” when describing the semantics of the C# memory model. Though that’s arguably correct, it’s also arguably an implementation detail rather than an important eternal truth. Another memory-model implementation detail I often see presented as a fact is “value types are allocated on the stack”. I often see it because of course, that’s what our documentation says.
Almost every article I see that describes the difference between value types and reference types explains in (frequently incorrect) detail about what “the stack” is and how the major difference between value types and reference types is that value types go on the stack. I’m sure you can find dozens of examples by searching the web.
I find this characterization of a value type based on its implementation details rather than its observable characteristics to be both confusing and unfortunate. Surely the most relevant fact about value types is not the implementation detail of how they are allocated, but rather the by-design semantic meaning of “value type”, namely that they are always copied “by value”. If the relevant thing was their allocation details then we’d have called them “heap types” and “stack types”. But that’s not relevant most of the time. Most of the time the relevant thing is their copying and identity semantics.
I regret that the documentation does not focus on what is most relevant; by focusing on a largely irrelevant implementation detail, we enlarge the importance of that implementation detail and obscure the importance of what makes a value type semantically useful. I dearly wish that all those articles explaining what “the stack” is would instead spend time explaining what exactly “copied by value” means and how misunderstanding or misusing “copy by value” can cause bugs.
Of course, the simplistic statement I described is not even true. As the MSDN documentation correctly notes, value types are allocated on the stack sometimes. For example, the memory for an integer field in a class type is part of the class instance’s memory, which is allocated on the heap. A local variable is hoisted to be implemented as a field of a hidden class if the local is an outer variable used by an anonymous method(*) so again, the storage associated with that local variable will be on the heap if it is of value type.
But more generally, again we have an explanation that doesn’t actually explain anything. Leaving performance considerations aside, what possible difference does it make to the developer whether the CLR’s jitter happens to allocate memory for a particular local variable by adding some integer to the pointer that we call “the stack pointer” or adding the same integer to the pointer that we call “the top of the GC heap”? As long as the implementation maintains the semantics guaranteed by the specification, it can choose any strategy it likes for generating efficient code.
Heck, there’s no requirement that the operating system that the CLI is implemented on top of provide a per-thread one-meg array called “the stack”. That Windows typically does so, and that this one-meg array is an efficient place to store small amounts of short-lived data is great, but it’s not a requirement that an operating system provide such a structure, or that the jitter use it. The jitter could choose to put every local “on the heap” and live with the performance cost of doing so, as long as the value type semantics were maintained.
Even worse though is the frequently-seen characterization that value types are “small and fast” and reference types are “big and slow”. Indeed, value types that can be jitted to code that allocates off the stack are extremely fast to both allocate and deallocate. Large structures heap-allocated structures like arrays of value type are also pretty fast, particularly if you need them initialized to the default state of the value type. And there is some memory overhead to ref types. And there are some high-profile cases where value types give a big perf win. But in the vast majority of programs out there, local variable allocations and deallocations are not going to be the performance bottleneck.
Making the nano-optimization of making a type that really should be a ref type into a value type for a few nanoseconds of perf gain is probably not worth it. I would only be making that choice if profiling data showed that there was a large, real-world-customer-impacting performance problem directly mitigated by using value types. Absent such data, I’d always make the choice of value type vs reference type based on whether the type is semantically representing a value or semantically a reference to something.
UPDATE: Part two is here
(*) Or in an iterator block.
> WHAT these types are doing in memory is in fact WHY a Value Type or an Object Type behaves the way it does.
I think your logic is reversed here. As Eric has pointed out above, it's the observe behavior of value types that is primary, and their implementation (including "what they are doing in memory") is dictated by that.
This leads to a question I've had about the .Net model model.
We use the terms "stack" & "heap" because back in the 1980s, a common C compiler/library organized local data using a stack data structure, while it stored malloc()ed blocked using a heap data structure.
So, the question is, in the current .NET implementation, is the Stack still a stack, and is the Heap still a heap?
In the last post , we’ve seen that one of the overloads used for creating a thread receives a parameter
One thing I never understood about Microsoft is why you have employees (like yourself) who blog about problems, but never fix those problems. In this case, you highlighted a significant shortcoming in the MSDN documentation of System.ValueType – if this shortcoming is known, why doesn't it get fixed?
Great post! In my experience, the only time where choosing struct vs class really paid off was in solving an extremely high Garbage Collection latency problem. Our application's original design built hundreds of binary trees that were constructed using reference-type nodes . As the trees began to get more and more populated (several millions of nodes each), the garbage collector demanded more and more cpu (keeping track and compacting for so many instances was evidently becoming prohibitely expensive). The application lockups were of several seconds, -totally unacceptable -when full garbage collections were performed. So we switched and implemented our trees using Lists of structs, with left and right references becoming indexes to the Lists (we knew the implementation detail that a List collection for a value type will actually create a dynamic array directly holding the values). This change dramatically reduced the number of object references to be tracked and completely removed the lockups. We also consumed a lot less memory since node indexes were 4 bytes rather than 8 byte references in our 64-bit environment!