Debunking another myth about value types

Debunking another myth about value types

Rate This
  • Comments 40

Here's another myth about value types that I sometimes hear:

"Obviously, using the new operator on a reference type allocates memory on the heap. But a value type is called a value type because it stores its own value, not a reference to its value. Therefore, using the new operator on a value type allocates no additional memory. Rather, the memory already allocated for the value is used."

That seems plausible, right? Suppose you have an assignment to, say, a field s of type S:

s = new S(123, 456);

If S is a reference type then this allocates new memory out of the long-term garbage collected pool, a.k.a. "the heap", and makes s refer to that storage. But if S is a value type then there is no need to allocate new storage because we already have the storage. The variable s already exists and we're going to call the constructor on it, right?

Wrong. That is not what the C# spec says and not what we do. (Commenter Wesner Moise points out that yes, that is sometimes what we do. More on that in a minute.)

It is instructive to ask "what if the myth were true?" Suppose it were the case that the statement above meant "determine the memory location to which the constructed type is being assigned, and pass a reference to that memory location as the 'this' reference in the constructor". Consider the following class defined in a single-threaded program (for the remainder of this article I am considering only single-threaded scenarios; the guarantees in multi-threaded scenarios are much weaker.)

using System;
struct S
    private int x;
    private int y;
    public int X { get { return x; } }
    public int Y { get { return y; } }
    public S(int x, int y, Action callback)
        if (x > y)
            throw new Exception();
        this.x = x;
        this.y = y;

We have an immutable struct which throws an exception if x > y. Therefore it should be impossible to ever get an instance of S where x > y, right? That's the point of this invariant. But watch:

static class P
    static void Main()
        S s = default(S);
        Action callback = ()=>{Console.WriteLine("{0}, {1}", s.X, s.Y);};
        s = new S(1, 2, callback);
        s = new S(3, 4, callback);

Again, remember that we are supposing the myth I stated above to be the truth. What happens?

* First we make a storage location for s. (Because s is an outer variable used in a lambda, this storage is on the heap. But the location of the storage for s is irrelevant to today's myth, so let's not consider it further.)
* We assign a default S to s; this does not call any constructor. Rather it simply assigns zero to both x and y.
* We make the action.
* We (mythically) obtain a reference to s and use it for the 'this' to the constructor call. The constructor calls the callback three times.
* The first time, s is still (0, 0).
* The second time, x has been mutated, so s is (1, 0), violating our precondition that X is not observed to be greater than Y.
* The third time s is (1, 2).
* Now we do it again, and again, the callback observes (1, 2), (3, 2) and (3, 4), violating the condition that X must not be observed to be greater than Y.

This is horrid. We have a perfectly sensible precondition that looks like it should never be violated because we have an immutable value type that checks its state in the constructor. And yet, in our mythical world, it is violated.

Here's another way to demonstrate that this is mythical. Add another constructor to S:

    public S(int x, int y, bool panic)
        if (x > y)
            throw new Exception();
        this.x = x;
        if (panic)
            throw new Exception();
        this.y = y;

We have

static class P
    static void Main()
        S s = default(S);
            s = new S(1, 2, false);
            s = new S(3, 4, true);
        catch(Exception ex)
            Console.WriteLine("{0}, {1}", s.X, s.Y);};

Again, remember that we are supposing the myth I stated above to be the truth. What happens? If the storage of s is mutated by the first constructor and then partially mutated by the second constructor, then again, the catch block observes the object in an inconsistent state. Assuming the myth to be true. Which it is not. The mythical part is right here:

Therefore, using the new operator on a value type allocates no additional memory. Rather, the memory already allocated for the value is used.

That's not true, and as we've just seen, if it were true then it would be possible to write some really bad code. The fact is that both statements are false. The C# specification is clear on this point:

"If T is a struct type, an instance of T is created by allocating a temporary local variable"

That is, the statement

s = new S(123, 456);

actually means:

* Determine the location referred to by s.
* Allocate a temporary variable t of type S, initialized to its default value.
* Run the constructor, passing a reference to t for "this".
* Make a by-value copy of t to s.

This is as it should be. The operations happen in a predictable order: first the "new" runs, and then the "assignment" runs. In the mythical explanation, there is no assignment; it vanishes. And now the variable s is never observed to be in an inconsistent state. The only code that can observe x being greater than y is code in the constructor. Construction followed by assignment becomes "atomic"(*).

In the real world if you run the first version of the code above you see that s does not mutate until the constructor is done. You get (0,0) three times and then (1,2) three times. Similarly, in the second version s is observed to still be (1,2); only the temporary was mutated when the exception happened.

Now, what about Wesner's point? Yes, in fact if it is a stack-allocated local variable (and not a field in a closure) that is declared at the same level of "try" nesting as the constructor call then we do not go through this rigamarole of making a new temporary, initializing the temporary, and copying it to the local. In that specific (and common) case we can optimize away the creation of the temporary and the copy because it is impossible for a C# program to observe the difference! But conceptually you should think of the creation as a creation-then-copy rather than a creation-in-place; that it sometimes can be in-place is an implementation detail that you should not rely upon.


(*) Again, I am referring to single-threaded scenarios here. If the variable s can be observed on different threads then it can be observed to be in an inconsistent state because copying any struct larger than an int is not guaranteed to be a threadsafe atomic operation.


  • Structs are for the most part a waste of time, and I fail to see the much of their use.

    Okay I'm being unkind. Is it me, or do structs get elevated status solely from "A class is a reference type. A struct is a value type. This is different because .... ", and that leads one erroneously to think that a struct is on equal footing with a class. To the untrained eye structs _seem_ to be like classes in so very many ways.

    To be blunt (and unkind again) in an OO language, structs are C hangovers. Necessary, but nobody talks about int or a double as much as they talk about structs.

    benefit: I can allocate 1,000,000 structs faster than I can allocate 1,000,000 objects. Nifty (I guess)

    point of difference: structs get automatically shallow copied on assignment, or when passed as parameters.

    I supposed my real gripe is only that structs can contain references, and if I had access to the spec back in the day, I would have snuck in "Structs containing references should be outlawed" or just purloined the page outright. Can you spot the issue below? Do you know the output? There's nothing magical or tricky, but this code has different semantics if s is a class versus s as a struct.

    S s = new S();

    s.SA = 123;

    s.SC = new C { CA = 12, CB = 23 };

    S prime = s;

    prime.SA = 999;

    prime.SC.CA = 222;

    I'm not beating on Eric or the compiler team here, or on anyone really. It just goads me that the only difference between class C and struct S is that the former doesn't have (shallow) copy on assignment semantics. e.g. structs have automagic IClonable MemberwiseClone() built-in (will all the lovely defects of the shallow-copy).

    I'm not going to too ornery about it, but the subtle defect is discernible by us, but only on close inspection, and is (mostly)  completely lost on a beginner. In fact I'm so used to by-reference semantics, that I will fail to notice that s.SA != prime.SA. I'd also wager that most of the people I've worked with would never spot the difference especially when buried under 1000000 lines of other code. I don't hover my cursor over every variable and check class vs struct, and I certainly don't code review every line of code checked in. Gosh I'm glad I use NUnit!

    I like source-code trivia. It's a fun past time, and I can impress my co-workers. But in reality, it's sometimes a bad thing that this is a fact in the first place. The fact I can subtly break a program by changing 'struct' to 'class' in a declaration is hard to swallow. The compiler doesn't complain because it's still a legal program. I already know Eric's arguments against adding warnings to the compiler, but anytime there's a reference type inside a struct, and that reference type doesn't have value-type characteristics (e.g. strings are fine) I want the compiler to sternly scold me for being downright "dangerous". (Dangerous is what the kids these days like to call code that has the potential to shoot someone in the foot, but then kids these days go a bit over the top)

    Structs, when used correctly, are fine. I really don't have a beef with them except they're too permissive, and they're allowed to act too much like classes, except for all these really small caveats. It's so easy to add a reference type to a struct that you probably don't notice the error when you make it. Value with reference types and references with value types. What is a beginner to think? Stack vs Heap is really the least of their concerns. Truly. in C#, storage is somewhat moot. I don't recall any python or other languages that care so much where in memory a variable happens to be stored. Yes it matters in C++, but not so much in C#.

    Unless required for some sort of interop, I mostly avoid structs and make immutable reference types instead. At the end of the day all I lose is some automagic copy semantics, but I can quite easily craft up an immutable reference type, so I don't really care if it's copied by value or by reference. When's the last time you really noticed a difference between a string and an int. Both have value type semantics which is what matters in the end.

    PS Thanks for another good post. :-)

  • What's wrong with structs that contain references? Not only are popular types like BigInteger and KeyValuePair implemented as structs that contain references, so are innumerable Enumerator types for the collections you use every day.

    As for whether you should care about Stack vs Heap, I believe that's Eric's point -- you shouldn't care. I don't think I've ever had to -- that's the BCL authors' jobs.

    When should you make your own value types? Almost never. Your average client/database/server app developer can probably go their whole career without having to make one. And when you do, they should probably be immutable. Note that your typical Enumerator is a mutable struct that contains references.

    So what about other languages that don't care to make the distinction? Look at Java: It has a few built-in value types (ints, floats, etc.) but you can't create your own. That means you can have an array of 1000000 doubles and it will take 8MB, but if you want an array of Complex (2 doubles), you need to create a 4MB (or 8MB) array of references, then allocate a million 16-byte objects, each with 8 (or 16) bytes of overhead. The solution is usually to create two separate arrays (one for reals and one for imaginaries), then rewrite all the code to use the hack.

    Python is even worse -- it has no value types whatsoever. Every int, float, or complex you create is a new object on the heap with all the overhead that entails. It's so bad there are numerous C modules (like NumPy) to allow you to create and manipulate arrays of primitive numeric types.

    In other words, if C# didn't have value types, you'd be stuck having to use hacks like creating them in C when you really do need them.

  • ficedula: I'm not sure that garbage collection always happens when memory is fresh in the CPU cache. You'd need a damn good GC for that...

    You can always write data in blocks of 16 bytes. Suppose your struct has 5 bytes - simply concatenate the first 16 copies, and then you've got an 80-byte block you can blit all over the place - or does quick blitting require that your 16-byte blocks are all identical?

  • configurator: The idea is that when you garbage collect, the objects that get collected will be touched as part of the collection process. (At least, based on my understanding of the way the .NET GC works). After marking all the objects that are 'live' the GC then compacts the live objects - as part of this process it'll be walking over all the objects [in the generation being collected] which will generally bring that memory into the CPU cache. Zeroing it at that point is potentially cheap since you're touching it anyway. It's not that the GC only runs when the memory is already in the cache - but that the process of GC will *bring* it into the cache anyway [so take advantage of that to clear it out].

    (I have no idea whether the .NET GC actually does preclear the memory at this point; but it seems like a logical option that they might have investigated. The costs might have turned out not to be worthwhile under real world conditions.)

    Writing data in blocks of 16-bytes: right, but your pattern is now 80 bytes long. Older x86 machines don't *have* 80 bytes of registers available for general copying of data! Newer x86 and x86-64 machines do in the form of SSE registers, but then you ideally need to be writing data to 16-byte aligned destinations, and you're taking up more registers. Writing zeros just means clearing one SSE register and then blasting those 16 bytes out again and again.

    (If I had to write a runtime for a language that supported default values for value types I'd certainly look at doing all these sort of things. I'd probably prefer to be writing a runtime for a system that just allowed me to zero everything out though...!)

  • @ficedula: I haven't touched assembly in a while, but I remember there being a call that did something like memset for large areas of memory, with any given piece of data. I could be wrong though.

  • @configurator: There's REP STOSD - but that only sets 4-bytes at a time, so you can set a 4-byte pattern, but no larger. x86-64 has REP STOSQ which writes 8-bytes at a time, but again, you can only set an 8-byte pattern. Great for setting a region of memory to zero (or all 0xffffffff, or 0x80808080, or whatever), but no use for setting any larger pattern. In order to set larger patterns, you need to hold the pattern in registers and write your own loop to copy each register into memory in turn. Your pattern still has to fit into available registers.

    (You can also use REP MOVSD to copy a region of memory from one place to another, but (a) that's slower, because it's memory-to-memory rather than register-to-memory, and (b) To copy an 80-byte pattern 1000 times over into an 80000 byte region, you'd need to set up and call REP MOVSD 1000 times ... and have your 80-byte pattern set up in memory first as the copy source.)

    (On modern PCs, it turns out that while the x86 string instructions (REP MOVS/STOS) are very compact - one instruction to blast as much data as you want from place to place - they're actually not as fast as using the SSE registers which can move 16-bytes at a time and can also be given cache hints.)

  • configurator & ficedula: Going on about how to initialize arrays with constant bit patterns is pointless, because it's not something you would likely want. If you didn't have to have the all-0 default constructor for value types you would want to have your own constructor called for each instance (like in C++), not have some arbitrary bit pattern copied to each element.

  • @Gabe there's a difference between confusing specs (IMO structs with embedded reference types) and some functionality that takes advantage of the quirks in a confusing spec. The latter, as great as they may be, does not excuse the former. There are innumerable examples of creative code that takes advantage of a language defect, but so what?

    Or are you implying the main driver of this 'feature' is to enable enumerators to work a little better?

    In any event, if you read my comment, it was:

    reference types - check

    value types without references - check

    value types with references - wtf?

    @ficedula ...  since this is conjecture about stuff we know nothing of, I'd wager memory isn't cleared until it's requested. The GC, we're told, is meant to be as quick as possible, and zeroing memory that may never be used by the program again doesn't seem like a good use of time. I'd file it under YAGNI (You Aint Gonna Need It).

  • @Gabe: The use-case would be that you could set defaults for all the fields which maintained sensible invariants that weren't necessarily based on all zero values ... even if for the struct to be used 'for real' you then initialised the fields in a constructor to more 'useful' values. I'd agree that this isn't beneficial *enough* to justify all the effort needed to implement the feature though.

    @Just a guy: You could well be right; I'm just speculating on a possible optimisation. I'd expect the CLR team to have thought about it and decided to implement or not based on how it effects real-world workloads ... possibly it's just not worth doing, period.

  • I just wonder if this is way overboard...I mean at a top level isn't that enough for typical development?

Page 3 of 3 (40 items) 123