Here’s another new obfuscation technique I’ve seen in use on malicious web sites recently. Check out the following HTML:

    <html><meta http-equiv=content-type content='text/html; charset=us-ascii'></head><body>¼óãòéðô¾áìåòô¨¢Ôèéó éó óïíå ïâæõóãáôåä óãòéðô¡¢©»¼¯óãòéðô¾</body></html>

Those funny characters are actually standard ASCII characters with the high bit of each byte set (the spaces were left alone). The page declares its charset as us-ascii, and IE evidently ignores the high bit of each byte when decoding it, so the browser sees ordinary markup. If the high-bit ASCII managed to get posted properly to this blog without getting mangled, you should be able to drop the obfuscated HTML into a file on a web server and observe that browsing to the file results in execution of the following script:

    <script>alert("This is some obfuscated script!");</script>

Here’s some quick and dirty C# code that will clear the high bit of each input byte:

    // Requires: using System; using System.IO;
    int char1;
    Char c1;
    FileStream fs = new FileStream([file path], FileMode.Open);
    BinaryReader r = new BinaryReader(fs);
    r.BaseStream.Seek(0, SeekOrigin.Begin);
    while (r.BaseStream.Position < r.BaseStream.Length)
    {
        char1 = r.ReadByte();
        // Mask off the high bit. Bytes that never had it set (the spaces, for example) pass through unchanged.
        char1 = char1 & 0x7F;
        c1 = (Char)char1;
        Console.Write(c1);
    }
    r.Close();

Drop this code into a console app and you’ll have a nice de-obfuscator.

This interesting behavior of US-ASCII in IE was noted by Kurt Huwig on BugTraq a few months ago.
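
For anyone who wants to experiment with the encoding itself, here is a minimal sketch of the transformation in both directions, written in JavaScript for Node.js. The function names `setHighBit` and `clearHighBit` are made up for illustration, and viewing the bytes as Latin-1 is just an assumption that makes them printable; I have no idea what tool the attackers actually used. Spaces are skipped when encoding only because the sample above leaves them untouched.

```javascript
// Hypothetical helpers, not taken from any real attack tool.

// Set the high bit (0x80) of every byte except spaces (0x20),
// mirroring the sample page, where spaces were left as-is.
// Assumes the input is plain 7-bit ASCII.
function setHighBit(text) {
  const bytes = Buffer.from(text, 'latin1');
  for (let i = 0; i < bytes.length; i++) {
    if (bytes[i] !== 0x20) {
      bytes[i] |= 0x80;
    }
  }
  return bytes;
}

// Clear the high bit of every byte. Bytes that never had it set
// pass through unchanged, so spaces survive the round trip.
function clearHighBit(bytes) {
  const out = Buffer.alloc(bytes.length);
  for (let i = 0; i < bytes.length; i++) {
    out[i] = bytes[i] & 0x7f;
  }
  return out.toString('latin1');
}

const script = '<script>alert("This is some obfuscated script!");</script>';
const obfuscated = setHighBit(script);

// Show the obfuscated bytes as Latin-1 text, then the recovered script.
console.log(obfuscated.toString('latin1'));
console.log(clearHighBit(obfuscated));
```

Run against the script from this post, the first line printed should match the funny characters in the HTML above, and the second should be the original script again. Masking with `0x7f` rather than subtracting `0x80` is what lets the decoder cope with the bytes that were never obfuscated.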