For the past year, as the whole world and its dog have blogged themselves onto the NY Times bestseller list (and in many cases off it too), I have gazed at my navel, and wondered if the lint there is the same color the world over.


Then today I read Matthew Wilson’s piece “Identity & Equality In .NET” (Dr. Dobb’s Journal Windows supplement, June 2004), which caused me to sit up, hastily button my shirt (yes, some of us at Microsoft do wear real shirts), and seek out a blog editor – just so I could summarize what Matthew said for all non-DDJ readers.


Like most (all?) C# programmers, I’ve undergone the baptism ritual of overriding object.Equals.  Having done that once, I now keep Jay Bazuzi’s Equals-in-a-box in a safe place (alongside Kevin’s addition).  [See the full list of guidelines on MSDN, including the guidance for operator ==.]


Here’s what Jay gives to those who say the magic word (it is pretty much the same as the code sample on MSDN – the Visual Studio formatting is just free publicity):

// override object.Equals
public override bool Equals(object obj)
{
    if (obj == null || GetType() != obj.GetType())
        return false;

    // TODO: write your implementation of Equals() here.
    throw new System.NotImplementedException();
    return base.Equals(obj);
}

And, for completeness, here are operator == and GetHashCode():

public static bool operator ==(SomeType o1, SomeType o2)
{
    return Object.Equals(o1, o2);
}

// override object.GetHashCode
public override int GetHashCode()
{
    // TODO: write your implementation of GetHashCode() here.
    throw new System.NotImplementedException();
    return base.GetHashCode();
}


These are pretty much the canonical forms of the overrides/overloads.


Now, if you expect that you are going to be testing equality a bunch of times (sorry for the jargon), you may be tempted to see how fast you can make your code.  [Sadly, omerta prevents me from telling you the real reason why developers flog their code to the extreme.]


So, starting from this point, Matthew (who seems to have a penchant for seeing how fast something will go) systematically derives his optimal versions, then runs some simple performance comparisons.  It was the performance differences that caused me to sit up so sharply.


In Jay’s override of object.Equals(), above, if you replace the calls to GetType() with our own beloved as operator, like so:

// if (obj == null || GetType() != obj.GetType())
if (obj == null || Object.ReferenceEquals(obj as SomeType, null))
    return false;

Matthew says you can knock off about 60% of the original cost.
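Put together, the as-based version might look like this.  (This is my own sketch, not Matthew’s code – SomeType and its id member are hypothetical stand-ins so the snippet compiles on its own.)

```csharp
using System;

public class SomeType
{
    private int id;

    public SomeType(int id) { this.id = id; }

    // as-based override: one isinst test instead of two GetType() calls
    public override bool Equals(object obj)
    {
        SomeType other = obj as SomeType;  // null if obj is null or not a SomeType
        if (Object.ReferenceEquals(other, null))
            return false;
        return this.id == other.id;
    }

    public override int GetHashCode()
    {
        return id;  // adequate for a single int member
    }
}

public class Demo
{
    public static void Main()
    {
        Console.WriteLine(new SomeType(1).Equals(new SomeType(1)));  // True
        Console.WriteLine(new SomeType(1).Equals(null));             // False
        Console.WriteLine(new SomeType(1).Equals("one"));            // False
    }
}
```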


Now, admittedly, the semantics of this version are modified ever so slightly, since as will return non-null when obj is an instance of a type derived from SomeType.  However, typically that is what you want, isn’t it?
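The difference is easy to see with a derived class.  (Again a sketch with hypothetical types; the two helper methods isolate the two null-check styles side by side.)

```csharp
using System;

public class Base
{
    public int id;
    public Base(int id) { this.id = id; }

    // GetType() style: requires an exact type match
    public bool StrictEquals(object obj)
    {
        if (obj == null || GetType() != obj.GetType())
            return false;
        return id == ((Base)obj).id;
    }

    // as style: any Base-derived instance is acceptable
    public bool RelaxedEquals(object obj)
    {
        Base other = obj as Base;
        if (Object.ReferenceEquals(other, null))
            return false;
        return id == other.id;
    }
}

public class Derived : Base
{
    public Derived(int id) : base(id) { }
}

public class Demo
{
    public static void Main()
    {
        Base b = new Base(7);
        Derived d = new Derived(7);
        Console.WriteLine(b.StrictEquals(d));   // False: runtime types differ
        Console.WriteLine(b.RelaxedEquals(d));  // True: d is-a Base
    }
}
```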


I’ll quickly summarize the rest of Matthew’s findings, just to underline the significance of that 60%.




Replacing

        Object.ReferenceEquals(obj as SomeType, null)

with

        ((Object)(obj as SomeType)) == null

netted about 4%.



And all the effort of working around the virtual call to Equals in the operator == overload, and narrowly surviving two potential gotchas, to end up at

public static bool operator ==(SomeType o1, SomeType o2)
{
    // return Object.Equals(o1, o2);
    return (Object)o1 == null ? (Object)o2 == null : o1.Equals(o2);
}

shaved off just 2.5%.  Hardly worth the potential errors.
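One of those gotchas is worth spelling out: writing o1 == null inside operator == re-enters the overload itself and recurses until the stack blows.  The (Object) casts force the CLR’s built-in reference comparison instead.  A sketch (hypothetical SomeType; I’ve also added the operator != that C# requires whenever == is overloaded):

```csharp
using System;

public class SomeType
{
    private int id;
    public SomeType(int id) { this.id = id; }

    public override bool Equals(object obj)
    {
        SomeType other = obj as SomeType;
        return !Object.ReferenceEquals(other, null) && id == other.id;
    }

    public override int GetHashCode() { return id; }

    public static bool operator ==(SomeType o1, SomeType o2)
    {
        // WRONG: "o1 == null" here would call this operator again -> StackOverflowException.
        // Casting to Object sidesteps the overload and compares references.
        return (Object)o1 == null ? (Object)o2 == null : o1.Equals(o2);
    }

    public static bool operator !=(SomeType o1, SomeType o2)
    {
        return !(o1 == o2);  // C# insists on != whenever == is overloaded
    }
}

public class Demo
{
    public static void Main()
    {
        SomeType a = new SomeType(1), b = new SomeType(1);
        Console.WriteLine(a == b);                            // True
        Console.WriteLine((SomeType)null == a);               // False
        Console.WriteLine((SomeType)null == (SomeType)null);  // True
    }
}
```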



Ordering the member comparisons with the fastest comparisons first (in Matthew’s case, ints before strings) took off a respectable 18%.
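The reordering trick is just short-circuit evaluation: put the cheapest, most-discriminating comparisons first, so && bails out before the expensive ones ever run.  A sketch with hypothetical members:

```csharp
using System;

public class Record
{
    private int id;       // cheap to compare
    private string name;  // more expensive to compare

    public Record(int id, string name) { this.id = id; this.name = name; }

    public override bool Equals(object obj)
    {
        Record other = obj as Record;
        if (Object.ReferenceEquals(other, null))
            return false;
        // int comparison first: if the ids differ, the string compare never runs
        return id == other.id && name == other.name;
    }

    public override int GetHashCode() { return id; }
}

public class Demo
{
    public static void Main()
    {
        Console.WriteLine(new Record(1, "a").Equals(new Record(2, "a")));  // False (ids differ)
        Console.WriteLine(new Record(1, "a").Equals(new Record(1, "a")));  // True
    }
}
```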



All told, Matthew was able to knock off 80% from his original canonical form.


Just to convince myself, I ran his suite on both the Everett and Whidbey Community Preview versions of Visual Studio.  In my case, as knocked off only 27 to 37% of the execution time, while reordering member comparisons contributed 25 to 39%.  In both versions, combining all the tweaks reduced execution time by 65% (YMMV - drive safely, wear your seatbelt, and remember to override GetHashCode).
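If you want to repeat the experiment yourself, a crude timing loop is enough.  (A sketch, not Matthew’s harness; the loop count is arbitrary, and I use Environment.TickCount since Everett has no Stopwatch class.)

```csharp
using System;

public class SomeType
{
    private int id;
    public SomeType(int id) { this.id = id; }

    public override bool Equals(object obj)
    {
        SomeType other = obj as SomeType;
        return !Object.ReferenceEquals(other, null) && id == other.id;
    }

    public override int GetHashCode() { return id; }
}

public class Bench
{
    public static void Main()
    {
        object a = new SomeType(42);
        object b = new SomeType(42);

        int start = Environment.TickCount;
        for (int i = 0; i < 10000000; i++)
            a.Equals(b);
        int elapsed = Environment.TickCount - start;  // millisecond resolution

        Console.WriteLine("10M Equals calls: " + elapsed + " ms");
    }
}
```

Run it once per variant (GetType() vs. as, reordered members, and so on) and compare the totals; the differences swamp the timer’s coarse resolution at ten million iterations.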


All this gives me new respect for the isinst MSIL instruction, and underlines something I hear time and again: “Beware of premature optimization, and don’t optimize until you know where the bottleneck(s) are.”