What curious property does this string have?

There are all kinds of interesting things in the Unicode standard. For example, the block of characters from U+A000 to U+A48F is for representing syllables in the "Yi script". Apparently it is a Chinese language writing system developed during the Tang Dynasty.

A string drawn from this block has an unusual property. The string consists of just two characters, both of them U+A0A2:

string s = "ꂢꂢ";

Or, if your browser can't hack the Yi script, that's the equivalent of the C# program fragment:

string s = "\uA0A2\uA0A2";

What curious property does this string have?

I'll leave some hints in the comments, and post the answer next time.

UPDATE: A couple of people have figured it out, so don't read the comments too far if you don't want to be spoiled. I'll post a follow-up article on Friday.

  • @Ramon

    GetHashCode in general can only express inequality within a single AppDomain.

    @iCe

    Code usually relies on a low number of collisions to get a large performance gain. But any code that relies on `GetHashCode()` being unique for correctness is broken.

  • Have there been any DoS attacks on IIS by passing in a bunch of these strings of varying lengths? Non-uniform hashes sometimes lead to nasty problems like that if data structures rely on a good distribution of hash values.

  • @Alex

    Yes! http://arst.ch/rz0

    Remembered this post while reading the above article.
