David Coe

Notes from the field
Cracking Open the Microsoft Anti Cross Site Scripting Library

Microsoft just released the first version of its Anti-Cross Site Scripting Library, V1.0.

Irena Kennedy briefly blogged about the differences between the System.Web.HttpUtility.HtmlEncode and the HtmlEncode function found in the XSS library.  I want to examine this a bit further.  Using Lutz Roeder's Reflector, let's crack open each method and look at the specific differences.

The HttpUtility.HtmlEncode(string) method internally calls the HtmlEncode(string, TextWriter) overload.  That method is shown below:

public static unsafe void HtmlEncode(string s, TextWriter output)
{
      if (s != null)
      {
            int num1 = HttpUtility.IndexOfHtmlEncodingChars(s, 0);
            if (num1 == -1)
            {
                  output.Write(s);
            }
            else
            {
                  int num2 = s.Length - num1;
                  fixed (char* local1 = s)
                  {
                        char* chPtr1 = local1;
                        char* chPtr2 = chPtr1;
                        while (num1-- > 0)
                        {
                              // Copy the leading run that needs no encoding.
                              output.Write(chPtr2[0]);
                              chPtr2++;
                        }
                        while (num2-- > 0)
                        {
                              char ch1 = chPtr2[0];
                              chPtr2++;
                              if (ch1 > '>')
                              {
                                    goto Label_00C4;
                              }
                              char ch2 = ch1;
                              if (ch2 != '"')
                              {
                                    if (ch2 == '&')
                                    {
                                          goto Label_00AD;
                                    }
                                    switch (ch2)
                                    {
                                          case '<':
                                          {
                                                output.Write("&lt;");
                                                continue;
                                          }
                                          case '=':
                                          {
                                                goto Label_00BA;
                                          }
                                          case '>':
                                          {
                                                output.Write("&gt;");
                                                continue;
                                          }
                                    }
                                    goto Label_00BA;
                              }
                              output.Write("&quot;");
                              continue;
                        Label_00AD:
                              output.Write("&amp;");
                              continue;
                        Label_00BA:
                              output.Write(ch1);
                              continue;
                        Label_00C4:
                              if ((ch1 >= '\x00a0') && (ch1 < '\x0100'))
                              {
                                    output.Write("&#");
                                    int num3 = ch1;
                                    output.Write(num3.ToString(NumberFormatInfo.InvariantInfo));
                                    output.Write(';');
                                    continue;
                              }
                              output.Write(ch1);
                        }
                  }
            }
      }
}


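As you can see, the function replaces only angle brackets, ampersands, and double quotes with named entities, plus a numeric encoding for a small range of characters starting at '\x00a0'. Everything else, including the single quote, is written through unchanged.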
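To make that behavior easier to see, here is a minimal sketch that restates the same decision logic without pointers. BlacklistEncodeSketch and its Encode method are hypothetical names of mine, not part of System.Web, and the numeric-entity range of 160 through 255 is my reading of the framework's behavior rather than something this listing guarantees.

```csharp
using System;
using System.Globalization;
using System.Text;

static class BlacklistEncodeSketch
{
    // Hypothetical helper, not the framework's code: it restates the
    // "encode a few known-bad characters" logic in safe C#.
    public static string Encode(string s)
    {
        if (s == null)
        {
            return null;
        }
        StringBuilder sb = new StringBuilder(s.Length);
        foreach (char ch in s)
        {
            switch (ch)
            {
                case '<': sb.Append("&lt;"); break;
                case '>': sb.Append("&gt;"); break;
                case '"': sb.Append("&quot;"); break;
                case '&': sb.Append("&amp;"); break;
                default:
                    // Assumption: characters 160..255 become numeric entities.
                    if (ch >= '\u00a0' && ch < '\u0100')
                    {
                        int code = ch;
                        sb.Append("&#");
                        sb.Append(code.ToString(NumberFormatInfo.InvariantInfo));
                        sb.Append(';');
                    }
                    else
                    {
                        // Everything else, including ' and =, passes through.
                        sb.Append(ch);
                    }
                    break;
            }
        }
        return sb.ToString();
    }

    static void Main()
    {
        // Prints: &lt;a href=&quot;x&quot; onclick='go()'&gt;
        Console.WriteLine(Encode("<a href=\"x\" onclick='go()'>"));
    }
}
```

Note that the single quotes and the equals sign come out untouched, which matters when the encoded value ends up inside a single-quoted attribute.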

Now let's take a look at what the AntiXSSLibrary.HtmlEncode(string) method looks like:

public static string HtmlEncode(string s)
{
      if (s == null)
      {
            return string.Empty;
      }
      StringBuilder builder1 = new StringBuilder(string.Empty, s.Length * 2);
      string text1 = s;
      for (int num1 = 0; num1 < text1.Length; num1++)
      {
            char ch1 = text1[num1];
            if ((((ch1 > '`') && (ch1 < '{')) || ((ch1 > '@') && (ch1 < '['))) || (((ch1 == ' ') || ((ch1 > '/') && (ch1 < ':'))) || (((ch1 == '.') || (ch1 == ',')) || ((ch1 == '-') || (ch1 == '_')))))
            {
                  builder1.Append(ch1);
            }
            else
            {
                  int num2 = ch1;
                  builder1.Append("&#" + num2.ToString() + ";");
            }
      }
      return builder1.ToString();
}


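A few things to note here.  The first is that this method is MUCH more compact than its cousin.  The second is that this method only ALLOWS certain characters to be present in the text: the letters a-z and A-Z, the digits 0-9, spaces, periods, commas, hyphens, and underscores.  Every other character is emitted as a numeric entity of the form &#NN;.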
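The whitelist is easier to read when the character test is pulled out into its own method. The sketch below does that; WhitelistEncodeSketch, IsSafe, and Encode are hypothetical names used only for illustration, and the code is a restatement of the listing above, not the library's actual source.

```csharp
using System;
using System.Text;

static class WhitelistEncodeSketch
{
    // Hypothetical helper: true for the characters the listing above
    // appends unchanged (a-z, A-Z, 0-9, space, '.', ',', '-', '_').
    static bool IsSafe(char ch)
    {
        return (ch >= 'a' && ch <= 'z')
            || (ch >= 'A' && ch <= 'Z')
            || (ch >= '0' && ch <= '9')
            || ch == ' ' || ch == '.' || ch == ',' || ch == '-' || ch == '_';
    }

    public static string Encode(string s)
    {
        if (s == null)
        {
            return string.Empty;
        }
        StringBuilder sb = new StringBuilder(s.Length * 2);
        foreach (char ch in s)
        {
            if (IsSafe(ch))
            {
                sb.Append(ch);
            }
            else
            {
                // Anything not on the whitelist becomes a numeric entity.
                sb.Append("&#").Append((int)ch).Append(';');
            }
        }
        return sb.ToString();
    }

    static void Main()
    {
        // Prints: &#60;b onclick&#61;&#39;x&#39;&#62;
        Console.WriteLine(Encode("<b onclick='x'>"));
    }
}
```

Here the single quotes and the equals sign are encoded along with the angle brackets, because the method never has to know in advance which characters are dangerous.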

Posted: Tuesday, March 07, 2006 3:58 PM by dcoe