High-bit ASCII obfuscation

Here’s another new obfuscation technique I’ve seen in use on malicious web sites recently.  Check out the following HTML:

<html><meta http-equiv=content-type content='text/html; charset=us-ascii'></head><body>¼óãòéðô¾áìåòô¨¢Ôèéó éó óïíå ïâæõóãáôåä óãòéðô¡¢©»¼¯óãòéðô¾</body></html> 

Those funny characters are actually standard ASCII characters with the high bit of each byte set.  If the high-bit ASCII managed to get posted to this blog without getting mangled, you should be able to drop the obfuscated HTML into a file on a web server, browse to the file in IE, and observe that the following script executes:

<script>alert("This is some obfuscated script!");</script>
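To make the mapping concrete: setting the high bit is the same as OR-ing a byte with 0x80, so "<" (0x3C) becomes 0xBC, which displays as ¼ when the bytes are viewed as ISO-8859-1 or Windows-1252. The sketch below is only an illustration of that arithmetic; the class and method names (HighBitMapping, WithHighBit, Describe) are made up for this example, and it assumes the input contains only 7-bit ASCII characters.

```csharp
using System;
using System.Text;

public static class HighBitMapping
{
    // Returns the byte value an ASCII character takes once bit 7 is set.
    // Assumes asciiChar is in the range 0x00-0x7F.
    public static byte WithHighBit(char asciiChar)
    {
        return (byte)(asciiChar | 0x80);
    }

    // Formats one line per character, e.g. "'<' 0x3C -> 0xBC".
    public static string Describe(string asciiText)
    {
        var sb = new StringBuilder();
        foreach (char c in asciiText)
        {
            sb.AppendFormat("'{0}' 0x{1:X2} -> 0x{2:X2}\n", c, (int)c, WithHighBit(c));
        }
        return sb.ToString();
    }
}
```

Calling HighBitMapping.Describe("<script>") should list the eight bytes 0xBC, 0xF3, 0xE3, 0xF2, 0xE9, 0xF0, 0xF4, 0xBE, which match the first eight funny characters in the sample above.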

Here’s some quick and dirty C# code that clears the high bit of each input byte:

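       // Requires: using System; using System.IO;
       // Replace [file path] with the path to the obfuscated file.
       using (FileStream fs = new FileStream([file path], FileMode.Open, FileAccess.Read))
       using (BinaryReader r = new BinaryReader(fs))
       {
           while (r.BaseStream.Position < r.BaseStream.Length)
           {
               // Mask off bit 7 rather than subtracting 0x80, so bytes that
               // are already plain ASCII pass through unchanged.
               int char1 = r.ReadByte() & 0x7F;
               Console.Write((Char)char1);
           }
       }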
 
Drop this code into a console app and you’ll have a nice de-obfuscator.
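Since US-ASCII is a 7-bit encoding, a page that declares charset=us-ascii has no legitimate reason to contain bytes of 0x80 or above, so counting such bytes is one rough way to decide whether a file is worth running through the de-obfuscator. The following is a minimal sketch of that check, not a complete detector; AsciiRangeCheck and CountHighBitBytes are hypothetical names invented for this example.

```csharp
using System;
using System.IO;

public static class AsciiRangeCheck
{
    // Returns the number of bytes that have bit 7 set (values 0x80-0xFF).
    public static int CountHighBitBytes(byte[] data)
    {
        int count = 0;
        foreach (byte b in data)
        {
            if ((b & 0x80) != 0)
            {
                count++;
            }
        }
        return count;
    }

    // Convenience overload that reads the whole file into memory first.
    public static int CountHighBitBytes(string path)
    {
        return CountHighBitBytes(File.ReadAllBytes(path));
    }
}
```

A non-zero count for a file that claims to be US-ASCII suggests something like the trick described here, although it could also just mean the charset declaration is wrong.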
 
This interesting behavior of US-ASCII in IE was noted by Kurt Huwig on BugTraq a few months ago.

Published Sunday, October 01, 2006 8:28 PM by dross
