Well, Jochen Neyens asked:

What's the easiest way to remove diacritic marks from characters using C#? I would like to have following function:

string RemoveDiacriticMark(string c)

Sample use:

RemoveDiacriticMark("é") -> "e"

RemoveDiacriticMark("ü") -> "u"

RemoveDiacriticMark("à") -> "a"

Well, there is not really an easy way to do it before Whidbey (.NET Framework 2.0), but with Whidbey you can use normalization and Unicode character properties (discussed previously in "FoldString.NET? No, but Whidbey has Normalization (which is kinda more cooler)" and "A little bit about the new CharUnicodeInfo class") to build something simple that does it all!
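To see why those two pieces fit together, here is a small sketch (the class and method names are made up for illustration) that decomposes a string with Form D and lists each resulting character with its Unicode category:

```csharp
using System;
using System.Globalization;
using System.Text;

static class DecompositionDemo {
  // Hypothetical helper: lists every char of the Form D version of s
  // as "U+XXXX:Category", separated by spaces.
  public static string Describe(string s) {
    string formD = s.Normalize(NormalizationForm.FormD);
    StringBuilder sb = new StringBuilder();
    foreach (char ch in formD) {
      if (sb.Length > 0) sb.Append(' ');
      sb.Append("U+").Append(((int)ch).ToString("X4"));
      sb.Append(':').Append(CharUnicodeInfo.GetUnicodeCategory(ch));
    }
    return sb.ToString();
  }

  static void Main() {
    // U+00E9 is "é"; Form D splits it into a base letter plus a combining mark.
    Console.WriteLine(Describe("\u00E9"));
  }
}
```

For "é" this should print U+0065:LowercaseLetter U+0301:NonSpacingMark, so the accent ends up as a separate character that can be dropped by category.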

WARNING: This code has been improved! Get the improved version from this other post.

namespace Remove {
  using System;
  using System.Text;
  using System.Globalization;
  class Remove {
    [STAThread]
    static void Main(string[] args) {
      foreach(string st in args) {
        Console.WriteLine(RemoveDiacritics(st));
      }
    }

    static string RemoveDiacritics(string stIn) {
      // Form D splits each precomposed character into a base character
      // followed by its combining marks.
      string stFormD = stIn.Normalize(NormalizationForm.FormD);
      StringBuilder sb = new StringBuilder();

      for(int ich = 0; ich < stFormD.Length; ich++) {
        UnicodeCategory uc = CharUnicodeInfo.GetUnicodeCategory(stFormD[ich]);
        // Keep everything except the combining marks (category Mn).
        if(uc != UnicodeCategory.NonSpacingMark) {
          sb.Append(stFormD[ich]);
        }
      }

      return sb.ToString();
    }
  }
}

Just put it in a file (remove.cs), compile it in Whidbey:

c:\temp\samples>csc remove.cs

and then run it!

c:\temp\samples>remove âãäåçèéêë ìíîïðñòó ôõöùúûüý
aaaaceeee
iiiiðnoo
ooouuuuy
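
Notice that the ð came through untouched. That is expected: U+00F0 has no canonical decomposition, so Form D has no combining mark to split off. A quick way to check any character, sketched with a made-up helper name:

```csharp
using System;
using System.Text;

static class EthCheck {
  // Hypothetical helper: true when Form D leaves the string unchanged,
  // i.e. nothing in it decomposes into base character + mark.
  public static bool IsUnchangedByFormD(string s) {
    return s.Normalize(NormalizationForm.FormD) == s;
  }

  static void Main() {
    Console.WriteLine(IsUnchangedByFormD("\u00F0")); // ð
    Console.WriteLine(IsUnchangedByFormD("\u00F1")); // ñ
  }
}
```

This should print True for ð and False for ñ. Characters in the first group need some other kind of mapping if you want them turned into plain ASCII letters.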

Now in prior versions your options are more limited, though a p/invoke to the FoldString API with the MAP_COMPOSITE flag will get you the decomposed string. There is also no CharUnicodeInfo class for information on Unicode properties, but you could use a regular expression instead (\p{Mn} will give you the equivalent category). I will leave doing the regular expression as an exercise for the reader....
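
For anyone who wants to check their answer to that exercise, here is one possible sketch. It only illustrates the category test: it still calls Normalize to do the decomposition, so on a pre-Whidbey runtime you would have to swap that line for the FoldString p/invoke (not shown here). The class name is made up for illustration.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class RegexRemove {
  // \p{Mn} matches any character in the NonSpacingMark category.
  static readonly Regex NonSpacingMarks = new Regex(@"\p{Mn}");

  public static string RemoveDiacritics(string stIn) {
    // Assumption: decomposition still comes from Normalize; only the
    // CharUnicodeInfo check is replaced by the regular expression.
    string stFormD = stIn.Normalize(NormalizationForm.FormD);
    return NonSpacingMarks.Replace(stFormD, "");
  }

  static void Main(string[] args) {
    foreach (string st in args) {
      Console.WriteLine(RemoveDiacritics(st));
    }
  }
}
```

Run with the same arguments as before, it should produce the same three lines of output.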

Enjoy!

This post brought to you by "û" (U+00fb, a.k.a. LATIN SMALL LETTER U WITH CIRCUMFLEX)