An internal email I read discussed the proper way to compare System.Uri instances in .NET 2.0. When comparing Uris, should you use Ordinal comparison or InvariantCulture?
I paused while reading... Ordinal? What is that?
The first time I opened up FxCop on an assembly I wrote, I was slightly terrified. The first thing I noticed was the number of times it suggested I should specify a culture for string comparison. I knew a bit about cultures, typically being one of the few on my development teams advocating the use of resource files for apps that *may* be internationalized one day. But FxCop threw it squarely in my face: i18n can be a harder problem than aligning the UI elements correctly for different cultures, especially when business rules dictate validation and comparison of input. FxCop made me realize the number of small yet insidious i18n bugs that I could potentially create, even though I was already thinking about some of the problems.
Back to comparing Uris...
There is an article on MSDN, "New Recommendations for Using Strings in Microsoft .NET 2.0", that explains the motivation behind Ordinal comparison and the new members of the System.StringComparison enumeration in .NET 2.0. I admittedly glazed over the first time I read it, but the third example in the section "The Motivation: The Turkish-I Problem" makes a world of sense... comparing with InvariantCulture is still a linguistic comparison, so it does not do a byte-wise comparison of the strings.
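To make the Ordinal versus InvariantCulture distinction concrete, here is a minimal sketch. It is my own illustration, not one of the examples from the MSDN article, and the class and method names (ComparisonSketch, EqualOrdinal, and so on) are hypothetical. It assumes a runtime with normal culture data available; results of the culture-sensitive calls can differ if the runtime is configured for invariant globalization.

```csharp
using System;
using System.Globalization;

// Hypothetical helper names, for illustration only.
static class ComparisonSketch
{
    // Two ways to write "é": one precomposed code point,
    // or "e" followed by a combining acute accent.
    public const string Composed = "\u00E9";
    public const string Decomposed = "e\u0301";

    // Ordinal compares the UTF-16 code units one by one.
    public static bool EqualOrdinal(string a, string b)
    {
        return string.Equals(a, b, StringComparison.Ordinal);
    }

    // InvariantCulture applies linguistic rules, so strings with
    // different code units may still be reported as equal.
    public static bool EqualInvariant(string a, string b)
    {
        return string.Equals(a, b, StringComparison.InvariantCulture);
    }

    // Case-insensitive comparison under a named culture. With "tr-TR",
    // "i" and "I" are not a case pair, which is the Turkish-I problem.
    public static bool EqualIgnoreCase(string a, string b, string cultureName)
    {
        return string.Compare(a, b, true, new CultureInfo(cultureName)) == 0;
    }
}
```

EqualOrdinal(Composed, Decomposed) is false because the code units differ, while EqualInvariant on the same pair would normally be expected to report them as equal. Likewise, EqualIgnoreCase("file", "FILE", "tr-TR") is the kind of call that surprises people, since Turkish casing rules pair "I" with a dotless "ı" rather than "i".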
So, if you are comparing Uris, do you compare using InvariantCulture (since the URI should be ASCII), or should you use Ordinal (given the guidance in "Choosing a StringComparison Member for Your Method Call" to use Ordinal for XML and HTTP)? The guidance from the owner of System.Uri was to just use Uri.Compare(). The prototype for Uri.Compare looks like:
public static int Compare(Uri uri1, Uri uri2, UriComponents partsToCompare, UriFormat compareFormat, StringComparison comparisonType);
Wow. What is UriFormat? The docs led to RFC 2396, which discusses escaping rules... so you likely want to pass SafeUnescaped when you need to compare Uri parts after the first instance of "#". But what do you pass for the StringComparison parameter? The owner of the API indicated that you should get the Uri parts and use Ordinal.
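Putting that guidance together, a call might look like the sketch below. The choice of UriComponents.HttpRequestUrl (scheme, host, port, path, and query, but no fragment) is my own assumption for the example, as are the names UriComparisonSketch and SameRequestUrl; pick whichever UriComponents flags match what "the same" means in your application.

```csharp
using System;

// Hypothetical helper, for illustration only.
static class UriComparisonSketch
{
    // Compares two absolute URIs on scheme, host, port, path and query,
    // ignoring the fragment, using an ordinal comparison of the parts.
    public static bool SameRequestUrl(Uri a, Uri b)
    {
        return Uri.Compare(
            a, b,
            UriComponents.HttpRequestUrl, // assumption: fragment is not significant
            UriFormat.SafeUnescaped,
            StringComparison.Ordinal) == 0;
    }
}
```

With this helper, http://example.com/a?x=1#one and http://example.com/a?x=1#two compare as equal because the fragment is excluded, while http://example.com/a and http://example.com/A do not, since Ordinal is case-sensitive and the path is compared as written.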
What I found even more interesting was the logic behind Uri.Compare when you pop it open using Reflector. If uri1 is not an absolute URI, and uri2 is an absolute URI, then you will always get a return of -1 regardless of the contents of the URIs themselves, lending weight towards absolute URIs. What is interesting is when you compare two Uris with the same base, where uri1 is not absolute and uri2 is. For example, imagine your base URI is http://msdn2.microsoft.com:
uri1 = /library/system.xml.xmldocument
uri2 = http://msdn2.microsoft.com/library/system.xml.xmldocument
In this case, even though they both point at the same thing, the comparison returns -1. Should the Compare method automatically take the base Uri into account during comparison when the base is available?
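One way to work around this today is to resolve any relative Uri against the base yourself before comparing. The sketch below does that; it is only an illustration of the idea, not what Uri.Compare does internally, and the names RelativeUriSketch, CompareRaw, and CompareResolved are hypothetical.

```csharp
using System;

// Hypothetical helper, for illustration only.
static class RelativeUriSketch
{
    // Straight call to Uri.Compare, no resolution of relative URIs.
    public static int CompareRaw(Uri a, Uri b)
    {
        return Uri.Compare(a, b, UriComponents.AbsoluteUri,
            UriFormat.SafeUnescaped, StringComparison.Ordinal);
    }

    // Resolves relative URIs against baseUri first, then compares.
    public static int CompareResolved(Uri baseUri, Uri a, Uri b)
    {
        Uri left = a.IsAbsoluteUri ? a : new Uri(baseUri, a);
        Uri right = b.IsAbsoluteUri ? b : new Uri(baseUri, b);
        return CompareRaw(left, right);
    }
}
```

Using the URIs from the example above, with uri1 created as new Uri("/library/system.xml.xmldocument", UriKind.Relative), CompareRaw(uri1, uri2) gives -1, while CompareResolved(new Uri("http://msdn2.microsoft.com"), uri1, uri2) gives 0.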
Hopefully, I will have some blog code out within the next couple of weeks that shows why this is an interesting problem.