Well, just in time for a wave of referral spam that is hitting my blog (mostly from http://www.ownsthis.com), I spent part of today writing a class that can consume the Movable Type Blacklist. The class lets you download the blacklist file from the server periodically (no more than once a day). I have written it so that anyone can integrate it into their .NET blogging package, or any other .NET program, and I just checked it into the dasBlog 1.7 tree. The nice thing about this is that the Blacklist is maintained in real time, so you won't have to rely on content filtering alone (the stuff that Scott did); you also get a pretty long and decent blacklist of bad sites. So far, in the past few hours, it has caught 100% of the referral spam with no false positives...
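For anyone wondering what the "no more than once a day" limit might look like, here is a minimal sketch in Python rather than .NET. It is not the code that went into dasBlog. The function name `needs_refresh`, the cache file path, and the idea of using the cached file's modification time as the "last downloaded" stamp are all assumptions made for illustration.

```python
import os
import time

ONE_DAY_SECONDS = 24 * 60 * 60


def needs_refresh(cache_path, now=None, max_age=ONE_DAY_SECONDS):
    """Return True if the cached blacklist is missing or at least max_age seconds old.

    Assumption: the file's modification time marks the last successful download.
    """
    if now is None:
        now = time.time()
    try:
        modified = os.path.getmtime(cache_path)
    except OSError:
        # No cached copy yet (or it is unreadable), so a download is due.
        return True
    return (now - modified) >= max_age
```

A caller would check `needs_refresh("blacklist.txt")` before hitting the server and keep using the cached copy otherwise, so a busy blog never asks for the list more than once a day.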

We are a few days away from releasing the final version of dasBlog 1.7. A very small number of folks have been running the bits over the weekend and as a result we've fixed a few bugs. A couple more days and we'll post the bits to SourceForge.

When that happens I'll post the MovableTypeBlacklist class. I've also considered writing an HttpModule to send these guys 404s, but didn't really think that was appropriate. The list is basically loaded into one long string, delimited by "|", and passed into a Regex to match a URL. Interestingly enough, when I tried to compile the Regex, my little console app ballooned to 150 MB and never finished running. Using a static Regex with the long static string, I was able to execute most matches in 0 - 10 milliseconds (the first one in the run below took 20).
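The join-with-"|" approach is easy to picture with a small sketch, again in Python rather than .NET and not the MovableTypeBlacklist class itself. The helper names `build_blacklist_pattern` and `is_blacklisted` are made up, the sample entries are just the domains from my test run, and skipping blank lines and `#` comment lines is an assumption about the file format. One simplification to note: the sketch escapes every entry so it is matched as literal text, whereas the approach described above hands the joined string straight to the regex engine.

```python
import re


def build_blacklist_pattern(entries):
    """Join blacklist entries into a single alternation regex ("a|b|c")."""
    cleaned = []
    for entry in entries:
        entry = entry.strip()
        # Assumption: blank lines and lines starting with '#' carry no entry.
        if entry and not entry.startswith("#"):
            # Simplification: escape each entry so it matches literally.
            cleaned.append(re.escape(entry))
    if not cleaned:
        # An empty alternation would match everything, so use a
        # pattern that can never match instead.
        return re.compile(r"(?!)")
    return re.compile("|".join(cleaned), re.IGNORECASE)


def is_blacklisted(pattern, url):
    """Return True if any blacklist entry occurs anywhere in the URL."""
    return pattern.search(url) is not None


# Sample entries only; the real blacklist is far longer.
SAMPLE_ENTRIES = [
    "6p.org.uk",
    "flatbedshipping.com",
    "apply-to-green-card.org",
    "ownsthis.com",
]

pattern = build_blacklist_pattern(SAMPLE_ENTRIES)
for url in ["http://www.ownsthis.com", "microsoft.com", "shahine.com"]:
    print(url, ":", is_blacklisted(pattern, url))
```

Building the pattern once and reusing it for every referrer is the point of the exercise: the cost of assembling the long alternation is paid a single time, and each lookup is then one search call.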

Here is a dump of the output from a test run of the class:

6p.org.uk : True
Executed in : 20 milliseconds

microsoft.com : False
Executed in : 0 milliseconds

shahine.com : False
Executed in : 0 milliseconds

flatbedshipping.com : True
Executed in : 0 milliseconds

apply-to-green-card.org : True
Executed in : 0 milliseconds

ownsthis.com : True
Executed in : 10 milliseconds