Remove any characters that are not alphanumeric.


To remove these characters, we first need to match them. To match all alphanumeric characters, we could write:

[a-zA-Z0-9]


To match every character except these, we can negate the character class with a leading ^:

[^a-zA-Z0-9]


It's then simple to use Regex.Replace():

string data = ...;

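Regex regex = new Regex("[^a-zA-Z0-9]");

// Call Replace on the instance; the static Regex.Replace would also need the pattern as an argument.
data = regex.Replace(data, "");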
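For anyone who wants to paste and run it, here is a minimal self-contained sketch of the same idea. The class name AlphanumericFilter, the method name Strip, and the sample input are made up for illustration; only the pattern and the Replace call come from the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class AlphanumericFilter
{
    // Matches any single character that is NOT an ASCII letter or digit.
    private static readonly Regex NonAlphanumeric = new Regex("[^a-zA-Z0-9]");

    // Hypothetical helper: returns the input with every non-alphanumeric character removed.
    public static string Strip(string input)
    {
        return NonAlphanumeric.Replace(input, "");
    }

    public static void Main()
    {
        Console.WriteLine(Strip("Hello, World! 123")); // prints "HelloWorld123"
    }
}
```

Keeping the Regex in a static field means the pattern is parsed once rather than on every call, which is worth doing if you strip many strings.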

Another way of doing this would be to use the pattern:

[^a-z0-9]

and then create the regex using RegexOptions.IgnoreCase.
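A rough sketch of that case-insensitive variant, assuming the lowercase-only pattern [^a-z0-9] combined with RegexOptions.IgnoreCase (the class and method names are again just placeholders):

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaseInsensitiveFilter
{
    // The pattern lists only lowercase letters; IgnoreCase makes it cover uppercase too.
    private static readonly Regex NonAlphanumeric =
        new Regex("[^a-z0-9]", RegexOptions.IgnoreCase);

    // Hypothetical helper: removes everything except ASCII letters and digits.
    public static string Strip(string input)
    {
        return NonAlphanumeric.Replace(input, "");
    }

    public static void Main()
    {
        Console.WriteLine(Strip("ABC-def_42")); // prints "ABCdef42"
    }
}
```

For plain ASCII input this should behave the same as the explicit [^a-zA-Z0-9] version; which one you use is mostly a matter of taste.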

Note: I've seen a few comments referring to Unicode and international characters. I haven't delved into that because I don't want to complicate the discussion, and, frankly, Unicode scares me. If you want the details, you can find them in the docs. For example, you can find out that \W is really equivalent to the Unicode categories [^\p{Ll}\p{Lu}\p{Lt}\p{Lo}\p{Nd}\p{Pc}].
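To make the difference concrete without going down the Unicode rabbit hole, here is a small comparison of the ASCII-only class against \W. The sample string and the class name UnicodeComparison are invented for the example; the point is only that \W treats accented letters and the underscore as word characters, so it leaves them in place.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnicodeComparison
{
    // Removes everything outside the ASCII ranges a-z, A-Z and 0-9.
    public static string StripAscii(string input)
    {
        return Regex.Replace(input, "[^a-zA-Z0-9]", "");
    }

    // Removes everything that \W matches, i.e. anything that is not a Unicode word character.
    public static string StripNonWord(string input)
    {
        return Regex.Replace(input, @"\W", "");
    }

    public static void Main()
    {
        string input = "caf\u00E9_1!"; // "café_1!" with a precomposed e-acute

        Console.WriteLine(StripAscii(input));   // prints "caf1"
        Console.WriteLine(StripNonWord(input)); // prints "café_1"
    }
}
```

So if "alphanumeric" is meant to include letters from other alphabets, \W is closer to what you want, but be aware that it also keeps the underscore.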