Immutability in C# Part One: Kinds of Immutability

I said in an earlier post that I believe that immutable objects are the way of the future in C#. I stand by that statement while at the same time noting that it is at this point sufficiently vague as to be practically meaningless! “Immutable” means different things to different people; different kinds of immutability have different pros and cons. I’d like to spend some time over the next few weeks talking about possible directions that C# could go to improve the developer experience when writing programs that use immutable objects, as well as giving some practical examples of the sort of immutable object programming you can do today.

(Again, I want to emphasize that in these sorts of “future feature” posts we are all playfully hypothesizing and brainstorming about ideas for entirely hypothetical future versions of C#. We have not yet shipped C# 3.0 and have not announced that there will ever be any future version of the language. Nothing here should be construed as any kind of promise or announcement; we’re just geeks talking about programming languages, ‘cause that’s what we do.)

So, disclaimers out of the way, what kinds of immutability are there? Lots. Here are just a few. Note that these categories are not necessarily mutually exclusive!

Realio-trulio immutability:

There’s nothing you can do to the number one that changes it. You cannot paint it purple (‡), make it even or get it angry. It’s the number one, it is eternal, implacable and unchanging. Attempting to do something to it – say, adding three to it – doesn’t change the number one at all. Rather, it produces an entirely different and also immutable number. If you cast it to a double, you don’t change the integer one; rather, you get a brand new double.

Strings, numbers and the null value are all truly immutable.

C# allows you to declare truly immutable named fields with the const keyword. The compiler ensures that the only things that are allowed to go into const fields are truly immutable things – numbers, strings, null. (See the section of the standard on “constant expressions” for details.)
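Here is a small sketch of what the compiler will and will not accept in a const field. The class and member names are made up for illustration.

```csharp
using System;

public static class Constants
{
    // Compile-time constants: numbers, strings and null are allowed.
    public const int One = 1;
    public const string Greeting = "hello";
    public const object Nothing = null;

    // Would not compile: an array creation is not a constant expression.
    // public const int[] Numbers = new int[] { 1, 2, 3 };

    public static int AddThree(int value)
    {
        // Produces a different number; the argument and the constant One are untouched.
        return value + 3;
    }
}
```

Calling Constants.AddThree(Constants.One) hands back a new value and leaves One exactly as it was.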

Write-once immutability:

Fields marked as const have to be compile-time constants, which is a bit of a pain if what you want to do is have a field which never changes but nevertheless cannot be computed until runtime. For example, in a later post I’m going to define an immutable stack class which has this code:

    public sealed class Stack<T> : IStack<T>
    {
        private sealed class EmptyStack : IStack<T>
        { /* ... */ }
        private static readonly EmptyStack empty = new EmptyStack();
        public static IStack<T> Empty { get { return empty; } }
        /* ... */
    }

I will want to create a singleton empty stack. Clearly it is not a compile-time constant, so I cannot make the field const. But I want to say “once this thing is initialized it is never going to change again.” That’s what the readonly modifier ensures. Basically it’s a “write only once” field. Not exactly immutable, since obviously it changes exactly once, from null to having a value. But pretty darn immutable.
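A minimal sketch of the difference, using a hypothetical Session class that has nothing to do with the stack: both fields are computed at runtime, so neither could be const, but readonly still lets the compiler reject any later assignment.

```csharp
using System;

public sealed class Session
{
    // Computed when the type is initialized, so it cannot be const.
    private static readonly DateTime started = DateTime.UtcNow;

    // Assigned exactly once, in the constructor.
    private readonly string name;

    public Session(string name)
    {
        this.name = name;
    }

    public static DateTime Started { get { return started; } }
    public string Name { get { return name; } }

    // Would not compile: a readonly field cannot be assigned outside
    // a constructor or a field initializer.
    // public void Rename(string newName) { this.name = newName; }
}
```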

Popsicle immutability:

...is what I whimsically call a slight weakening of write-once immutability. One could imagine an object or a field which remained mutable for a little while during its initialization, and then got “frozen” forever. This kind of immutability is particularly useful for immutable objects which circularly reference each other, or immutable objects which have been serialized to disk and upon deserialization need to be “fluid” until the entire deserialization process is done, at which point all the objects may be frozen.

There is at present no really universal convention for how to declare a freezable object, and there certainly is no support in the compiler for this kind of immutability.
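Absent a convention, one way to hand-roll it is a flag that every setter checks. This is only a sketch under my own assumptions: FreezableNode, Freeze and IsFrozen are invented names, the check happens at runtime rather than compile time, and nothing here makes the freeze itself safe to race against on another thread.

```csharp
using System;

public sealed class FreezableNode
{
    private bool frozen;
    private string label;
    private FreezableNode next;

    public bool IsFrozen { get { return frozen; } }

    public string Label
    {
        get { return label; }
        set { ThrowIfFrozen(); label = value; }
    }

    public FreezableNode Next
    {
        get { return next; }
        set { ThrowIfFrozen(); next = value; }
    }

    // After this call every setter throws.
    public void Freeze() { frozen = true; }

    private void ThrowIfFrozen()
    {
        if (frozen)
            throw new InvalidOperationException("The object is frozen.");
    }
}

public static class FreezableDemo
{
    // Builds two nodes that refer to each other, then freezes both.
    public static FreezableNode BuildCycle()
    {
        var a = new FreezableNode { Label = "a" };
        var b = new FreezableNode { Label = "b", Next = a };
        a.Next = b; // only possible because 'a' is still fluid
        a.Freeze();
        b.Freeze();
        return a;
    }
}
```

The circular reference is the interesting part: with readonly fields alone, one of the two nodes would have to exist before the other could be constructed.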

Shallow vs deep immutability:

Consider a write-once field containing an array:

public class C {
    private static readonly int[] ints = new int[] { 1, 2, 3 };
    public static int[] Ints { get { return ints; } }
}

The value of the field cannot be changed; C.ints = null; would be illegal even from inside the class, anywhere other than a static constructor or the field's own initializer. This is a sort of “referential” immutability. But there is nothing immutable at all about the array itself! C.Ints[1] = 100; is still perfectly legal from outside the class.

The ints field is “shallowly” immutable. You can rely upon it being immutable to a certain extent, but once you reach a point where there is a reference to a mutable object, all bets are off.
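To make that concrete, here is a sketch with a stand-in class (named ShallowHolder here so that it stands on its own) and a caller that changes the array's contents without ever touching the field.

```csharp
using System;

public class ShallowHolder
{
    private static readonly int[] ints = new int[] { 1, 2, 3 };
    public static int[] Ints { get { return ints; } }
}

public static class ShallowDemo
{
    public static int MutateThroughReference()
    {
        // The field still refers to the same array object,
        // but the contents of that array have changed.
        ShallowHolder.Ints[1] = 100;
        return ShallowHolder.Ints[1];
    }
}
```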

Obviously the opposite of shallow immutability is “deep” immutability; a deeply immutable object is immutable all the way down.

If we had immutability in the type system, something like the far stronger kind of “const” in C/C++, then a hypothetical future compiler could verify that an object marked as deeply immutable had only deeply immutable fields.

Objects which are truly madly deeply immutable have a lot of great properties. They are 100% threadsafe, for example, since obviously there will be no conflicts between readers and (non-existent) writers. They are easier to reason about than objects which can change. But their strict requirements may be more than we need, or more than is practical to achieve.

Immutable facades:

Since the contents of an array (though, interestingly enough, not its size) may be changed arbitrarily, it’s a bad idea to expose data that you want to be logically read-only in a public array field. To make this a bit easier, the base class library lets you say

// requires: using System.Collections.ObjectModel;
public class C {
    private static readonly int[] intarray = new int[] { 1, 2, 3 };
    private static readonly ReadOnlyCollection<int> ints = new ReadOnlyCollection<int>(intarray);
    public static ReadOnlyCollection<int> Ints { get { return ints; } }
}

The read-only collection has the interface of a regular collection; it just throws an exception every time a method which would modify the collection is called. However, clearly the underlying collection is still mutable. Code inside C could mutate the array members.

Another down side of this kind of immutability is that the compiler is unable to detect attempts to modify the collection. Attempts to, say, add new members to the collection will fail at runtime, not at compile time.
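Both weaknesses are easy to see in a sketch. FacadeDemo and its method names are invented for illustration. Note that ReadOnlyCollection<T> implements the mutating members explicitly, so the runtime failure shows up when the collection is used through IList<T>.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class FacadeDemo
{
    public static int WriteThroughUnderlyingArray()
    {
        int[] backing = new int[] { 1, 2, 3 };
        var facade = new ReadOnlyCollection<int>(backing);

        // The facade is a view, not a copy: changes to the array show through.
        backing[1] = 100;
        return facade[1];
    }

    public static bool AddFailsAtRuntime()
    {
        IList<int> asList = new ReadOnlyCollection<int>(new int[] { 1, 2, 3 });
        try
        {
            // Compiles, because IList<int> declares Add, but throws when run.
            asList.Add(4);
            return false;
        }
        catch (NotSupportedException)
        {
            return true;
        }
    }
}
```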

This sort of immutability is a special case of...

Observational immutability:

Suppose you’ve got an object which has the property that every time you call a method on it, look at a field, etc, you get the same result. From the point of view of the caller such an object would be immutable. However you could imagine that behind the scenes the object was doing lazy initialization, memoizing results of function calls in a hash table, etc. The “guts” of the object might be entirely mutable.

What does it matter? Truly deeply immutable objects never change their internal state at all, and are therefore inherently threadsafe. An object which is mutable behind the scenes might still need to have complicated threading code in order to protect its internal mutable state from corruption should the object be called on two threads “at the same time”.
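Here is a sketch of what that looks like, using a hypothetical Document class that caches a computed value. Callers always see the same answer, but the cache is mutable state, so it needs a lock. The lock is my choice for the sketch; other synchronization strategies would do as well.

```csharp
using System;

public sealed class Document
{
    private readonly string text;
    private readonly object gate = new object();
    private int? wordCount; // mutable cache, never visible to callers

    public Document(string text)
    {
        this.text = text;
    }

    public int WordCount
    {
        get
        {
            // Without the lock, two threads could race on the cache.
            lock (gate)
            {
                if (!wordCount.HasValue)
                {
                    wordCount = text.Split(
                        new[] { ' ' },
                        StringSplitOptions.RemoveEmptyEntries).Length;
                }
                return wordCount.Value;
            }
        }
    }
}
```

A deeply immutable version would compute the count in the constructor and store it in a readonly field, trading the lazy evaluation for the absence of any locking.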

Summing up:

Holy goodness, this is complicated! And we have just barely touched upon the deeply complex relationship between immutability of objects and “purity” of methods, which opens up huge cans of worms.

So, smart people, what do you think? Are there forms of immutability which I did not touch upon here that you like to take advantage of in your programs? Are there any particular forms of immutability which you would like to see made easier to use in C#?

Next time: let’s get a little more practical. I already implemented an immutable stack in my A* series, but that was pretty special-purpose. We’ll take a look at how one might implement a general-purpose immutable stack today in C# 3.0. We'll then expand that to immutable queues, trees, etc. (And I might even discuss how one could take advantage of typesafe covariance when designing interfaces for immutable data structures, oh frabjous day!)

(‡) A dear old friend of mine from school who happens to be a grapheme-colour synaesthete tells me that of course you cannot paint the number one purple because it is already blue. Silly me!

  • JayBaz and I have thought about "Popsicle Immutability" quite a lot.  We decided that having two different forms of a type helped out here.  A mutable "Builder" class, and then an immutable version.  So you might have a class like:

    class Immutable {
        public class Builder {
            public int Prop { get; set; }
            public Immutable Realize() { return new Immutable(this.Prop); }
        }

        readonly int _prop;
        public int Prop { get { return this._prop; } }

        public Immutable(int prop)
        {
            this._prop = prop;
        }
    }

    Jay even went so far as to implement a code generator that would generate the two classes, using CodeDom I think.

  • Eric, if you and Joe Duffy haven't been talking to each other, you really should do. (I'd pay good money to hear that conversation, too - on practically any topic about C# or concurrency.) From a couple of days ago:

    http://www.bluebytesoftware.com/blog/2007/11/11/ImmutableTypesForC.aspx

    Also, I'd claim that String actually is only observationally immutable. StringBuilder manages to mutate it with no problems - it just makes sure that it never mutates a string which has already been publicised. At least, that's my understanding.

    Jon

  • Though of course I have been seeing email from him for years, I actually just met Joe for the first time a few days ago, coincidentally enough at a meeting where we were discussing immutability in C#/CLR.  

    I had no idea he was blogging about the same thing, thanks for pointing this out.

  • And regarding strings -- yes, you are right, I had not considered the magic that StringBuilder does behind the scenes. And indeed, there is all kinds of threading logic in a string builder to ensure that the actually-mutable underlying state of the string is always threadsafe and never observable by the caller.

  • Excellent post Eric, looking forward to part 2! I see Joe Duffy is also riffing on the subject of immutability right now. Clearly a hot topic with tech like ParallelFX in the wings.

  • I've, ahem, somehow managed to see some possible C# code for the string.Join method (how in the world could that have happened?).  A string is accessed by address after being pinned and written to directly.  This is completely acceptable, as there is no way for multiple threads to access the same mutating string; once it's out into the brutal, multithreaded world, it is immutable.  This is popsicle immutability.  Perhaps it would be possible to establish the freezing point and then have the compiler (or some tool) verify that pre-frozen code is thread-safe (or, at least make some guarantee).

    One type, or perhaps _usage_ of immutability would be to specify that a given variable (or parameter, field) isn't to be changed.  The discussion I have seen about this issue is that it isn't really worth it, that C++-style const functions and arguments ended up being a royal pain.  However, it seems to me that the excellent tooling support wouldn't be too hard to provide, so that you could propagate constness easily.  It would be pretty cool to be able to analyze code and show what objects are being modified where (this can be done) -- with const-support, one could show which objects are _allowed_ to be modified where.

  • In my thinking on this, I use two categories: immutability of interface vs. immutability of implementation.  The first is what consumers are looking for; the second makes it easier to accomplish the first.

    I also wrote about the idea of writing tools to help verify that you've accomplished the immutability you set out for: http://blogs.msdn.com/jaybaz_ms/archive/2004/06/10/152748.aspx

    And I finally wrote such a tool.  I wish I had published it in my blog before I left; I think that someone still at MS should do that (Kevin, it's in Framework\ImmutableAttribute.cs).  You put the attribute on classes that should be immutable.  At unit test time, there's code to reflect over all types in all your assemblies, looking for those with this attribute.  When it finds one, it verifies that the type meets the following rules:

    1. The base type meets the rules.

    2. The members are marked 'readonly'

    3. The types of the members meet the rules.

    In practice, this fails very quickly, as soon as you bring in a type that you didn't write, so there are some special exceptions:

    1. Builtin types get a pass

    2. 'enum' types get a pass

    3. There's a whitelist of types in .Net Framework that are immutable, but lack the attribute.

    4. You can put the [Immutable] attribute on a _field_, indicating that the usage of this field is correct wrt. immutability, even though the type of the field isn't itself immutable.

    5. You can put [Immutable (OnFaith=true)] on a type to say that a type should be assumed to meet the rules without further inspection.

    6. In your test you can pass in an additional whitelist of types that you don't own, aren’t part of .Net Framework, and are immutable.
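The three rules and the first two exceptions can be roughly sketched with reflection. To be clear, this is not the tool described in this comment. ImmutableAttribute, ImmutabilityChecker, Point and Counter are all hypothetical names, and the whitelists, the field-level attribute and OnFaith are left out. The sketch also assumes that arrays, interfaces and object-typed fields should be rejected, since they may refer to mutable state.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical marker attribute; not a framework type.
[AttributeUsage(AttributeTargets.Class | AttributeTargets.Struct)]
public sealed class ImmutableAttribute : Attribute { }

public static class ImmutabilityChecker
{
    public static bool IsImmutable(Type type)
    {
        return Check(type, new HashSet<Type>());
    }

    // Returns every [Immutable]-marked type in the assembly that breaks the rules.
    public static List<Type> FindViolations(Assembly assembly)
    {
        var violations = new List<Type>();
        foreach (Type type in assembly.GetTypes())
        {
            if (type.IsDefined(typeof(ImmutableAttribute), false) && !IsImmutable(type))
                violations.Add(type);
        }
        return violations;
    }

    private static bool Check(Type type, HashSet<Type> visited)
    {
        // Exceptions 1 and 2: built-in types and enums get a pass.
        if (type.IsPrimitive || type.IsEnum || type == typeof(string))
            return true;

        // Assumption: these may refer to mutable state, so reject them.
        if (type.IsArray || type.IsInterface || type == typeof(object))
            return false;

        // A type already examined (or referring back to itself) is not re-examined.
        if (!visited.Add(type))
            return true;

        // Rule 1: the base type meets the rules.
        Type baseType = type.BaseType;
        if (baseType != null && baseType != typeof(object)
            && baseType != typeof(ValueType) && !Check(baseType, visited))
            return false;

        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public
            | BindingFlags.NonPublic | BindingFlags.DeclaredOnly;
        foreach (FieldInfo field in type.GetFields(flags))
        {
            // Rule 2: the field is readonly. Rule 3: its type meets the rules.
            if (!field.IsInitOnly || !Check(field.FieldType, visited))
                return false;
        }
        return true;
    }
}

[Immutable]
public sealed class Point
{
    private readonly int x;
    private readonly int y;
    public Point(int x, int y) { this.x = x; this.y = y; }
    public int X { get { return x; } }
    public int Y { get { return y; } }
}

[Immutable] // deliberately wrong, so the checker has something to find
public sealed class Counter
{
    public int Count;
}
```

A unit test could then assert that ImmutabilityChecker.FindViolations returns an empty list for each assembly under test.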

    Now, an [Immutable] attribute  on a type tells the consumer that the type has Immutability of Interface.  It also helps the implementer to know they've made an immutable type.

    There's another test that requires that all structs in your assemblies have the [Immutable] attribute.  

    I wish that C# and .Net would help out. For example, if you could write 'readonly class C { … }' which would require at compile time that all members are marked 'readonly', and the members types and base types were "readonly types" - now you get deep immutability at compile time.  I also wish that the .Net Framework would mark its types as 'readonly' or '[Immutable]', and strive to make immutable types where possible, and refrain from providing mutable value types (*looks at 'Point'*).

    No, I never did write the code generator that Kevin mentions.  One reason is that CodeDom doesn't support 'readonly' on fields.  Doh.  Another is that I don't think you really want to rely on a code generator, with its own custom language, to define your types - even just the fields of your data types.  I think you want to use C# to do that.  Hopefully a future C# compiler will include a parser that I can program against; then you can write your own types, and I can create a 'Builder' class to go with it.

    I'm tempted to write the code generator, anyway. I think I'll do it in PowerShell, just because I <3 PowerShell.  One day...

  • The fact that ReadOnlyCollection<T> enforces its immutability at runtime rather than compile-time is just a detail of this class, not a general principle of Immutable Facades.  I would argue that Immutable Facades should generally be written to enforce immutability at compile-time.

  • Eric, I really *really* enjoy reading your blog! It's full of fascinating ideas and problems that I sometimes didn't know of, or sometimes knew intuitively but would not have been able to express so clearly.

    The thing is that we are at the Jurassic time of computer languages. Languages should enable you to express your ideas, not get in the way. I find it so frustrating when the language gets in the way. Happened yesterday with the lack of operator constraints on generic types ;-)

    Back to immutability, yes it seems to be the hot topic 'du jour', might have something to do with the rise of multi-core processors, the popularity of Erlang and functional programming in general, etc.

    Besides Joe's great blog that was mentioned already, I would like to point out an 'old' post from Wes Dyer on the same topic: http://blogs.msdn.com/wesdyer/archive/2007/03/01/immutability-purity-and-referential-transparency.aspx

    If you are not convinced that mutations are bad, just look at the picture ;)

    Franck

  • something java has and which I quite like is immutability of local variables.

    In one case they make it quite clear and compile checked that it is written to once in the method call (I know the JIT doesn't need the hint to optimize the enregistration process itself)

    They allow closures to be more effectively controlled (though in theory static analysis can do this too) so that a more complex case can be made more simple (as they are effectively used in java where you must apply the keyword to make use of local variables in the limited closures provided)

    As a side note the variable created by foreach is a form of immutability...

  • Eric, this reminds me of the suggestion I posted in part 9, and I'm glad you found a way of partly answering my question without breaking your habits and just replying to one of my comments ;-)

    So you're thinking about it, but what do you think about the other suggestions that came with it?

    - Solving the array covariance problem, or at least providing a path to solving it, by

     a) providing warnings for cases where arrays are assigned in a covariant way to mutable array variables

     b) optionally elevating these warnings to errors, marking assemblies that do this as "safe", and moving type checks on array item assignments for safe assemblies from runtime to static analysis (peverify-level) (if that's possible)

    This would eliminate one of two implicit cast scenarios in c# I know of that can fail at runtime (foreach is the other one). I don't know about everyone else, but I would go to some lengths to spread "const" (or whatever) keywords over my sources (especially much used libraries) to get rid of this evil legacy.

    - eventually introducing IReadOnlyList/Collection interfaces in the BCL so that we do not depend solely on IsReadOnly and exceptions at runtime

    Your categorization of immutability is good, I don't have any additional insights to share. But if I were a language designer, I think I'd spend a good deal talking to people who have real-world experience with sandboxing scenarios and stuff like PLINQ. (I guess the discussions of C++ const are old news for you)

    BTW, while you're talking about immutable data structures, why not provide a Lisp-like list class that is divided into head and tail? They fit the recursive style of functional programming much better than IEnumerable, and I guess .First and .Skip(1) come with a lot of overhead (skip creates a new iterator for the GC to play with; for First, it depends), and while they certainly read better than car and cdr, I'd like head and tail so much better! I guess they'd also show better performance when concatenating immutable lists, because you can always add a new head to an existing immutable tail, without copying it. This is probably a trick we're going to see in your immutable stack implementation too, right? (unfortunately, implementing Tail for lists that are not actually implemented recursively is quite a drag, overhead-wise)
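A head/tail list of the kind suggested here might look roughly like the following. ConsList<T> and Prepend are invented names, and this is a sketch rather than anything from the BCL or from the stack promised in the post. The point to notice is that Prepend allocates one node and shares the existing tail instead of copying it.

```csharp
using System;

public sealed class ConsList<T>
{
    public static readonly ConsList<T> Empty = new ConsList<T>();

    private readonly T head;
    private readonly ConsList<T> tail;
    private readonly bool isEmpty;

    private ConsList()
    {
        isEmpty = true;
    }

    private ConsList(T head, ConsList<T> tail)
    {
        this.head = head;
        this.tail = tail;
    }

    public bool IsEmpty { get { return isEmpty; } }

    public T Head
    {
        get
        {
            if (isEmpty) throw new InvalidOperationException("An empty list has no head.");
            return head;
        }
    }

    public ConsList<T> Tail
    {
        get
        {
            if (isEmpty) throw new InvalidOperationException("An empty list has no tail.");
            return tail;
        }
    }

    // Returns a new list whose tail is this very list; nothing is copied.
    public ConsList<T> Prepend(T value)
    {
        return new ConsList<T>(value, this);
    }
}
```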

  • The only other type of immutability in my mind would be "variant immutablility" where the object is immutable to some actors, but totally mutable to others.  Something like the C++ friend declaration, but with limitations on what operations (read/write) can be performed on which member variables by which "friends".

  • re: Immutable facades

    I agree with mharder (and Stefan) here.  Given that both C# and Java do it this way (wrapper classes that enforce immutability at run-time), is there a reason that providing e.g. IConstList and IList : IConstList is more difficult than it seems?

  • The problem with IConstList is that all the functions that are written for IList need to then be rewritten to work for IConstList, unless IConstList is derived from IList, in which case you can only check it at run-time.

    C++ has this problem bad.  The const keyword is a virus.  Once you use it in one place you find a bunch of functions where you want to pass the const variable, but the function didn't use the const keyword on the argument so the function has to be modified, in many cases by just adding the const keyword.

    C# could help solve this problem by analyzing each function and deciding which arguments are constable, that is, which arguments are not marked const, but are nonetheless never modified and therefore passing a const variable in would be safe.

    The key to making const more popular than it was in C++ is to make it less infectious.

  • But doesn't const have to be infectious or it loses its power?

    The reason it gets ugly in C/C++ codebases is that there is lots of legacy code which was pretty loosey-goosey with const-correctness. Because it's possible to cast-away constness in C/C++, people often choose that as the solution to interfacing with these codebases. (Hey, it makes the compiler stop complaining, right?)

    With C# and the .Net framework, there was an opportunity to build robust concepts of immutability right into the foundations from the get-go. I now fear that this opportunity has been lost and there is too much code out there already (notably the existing framework classes) that new code will have to interface with.
