There have been a number of articles on Rustock lately, so I thought I’d chime in with my take, which isn’t that novel but I have to inflate my post count.  Techworld recently reported that Rustock, which started sending spam over TLS, has stopped doing so:

The Rustock mega-botnet appears to have ditched the experimental use of TLS (transport layer security) to obscure its activity, Symantec has reported.  Rustock’s use of TLS now averages between 0.1 and 0.2 percent of all spam, peaking at 0.5 percent, a tiny fraction of the levels seen in March when it reached averages of around 25 percent with a peak of as much as 77 percent.

The key moment was on 20 April, when the volume of spam featuring the tactic suddenly plunged to sub-one percent levels after an equally sudden rise in rates in the weeks prior to that date.  TLS adds a small but cumulative overhead to server email processing, which ties up mail servers but also affects the rate at which spam is sent. Why Rustock’s controllers adopted the technique at all was never clear but might have been connected to a misplaced belief that it would make it harder for servers to filter its activity or detect the command and control system used to direct its activity.

“It would seem that the botnet controllers, especially those behind Rustock, have perhaps realised that the use of TLS gave them little or no discernable benefits, and instead impeded their sending capacity owing to the additional bandwidth and processing overhead needed for TLS,” reckons the August 2010 MessageLabs Intelligence Report.

Back in March, I originally reported that we were seeing an increased amount of spam from Rustock sent over TLS.  As the authors of MessageLabs’ Report conclude, it’s unknown why they would have used it in the first place.  There really isn’t that much benefit to using TLS to send spam.  No spam filter worth its salt uses TLS as a mechanism for trust in and of itself, so it wouldn’t aid in delivery.  It’s possible that they thought using it made the botnet more resilient to detection, since it would be more difficult to find the command-and-control nodes, but this doesn’t make sense either.  The channel worth encrypting is the one between the bots and the C&C servers: the traffic that updates the bots’ instructions is what needs protecting from disruption.  Sending messages over an encrypted channel to mail recipients, however, has no benefit: the payload (spam) is the same and looks the same to the end user since it must be displayed in clear text.

The maintainers of Rustock probably determined this and decided to abandon the trick.  Perhaps they thought that end-to-end encryption was a useful technique, but it really is not.  It doesn’t buy them anything and in fact is very cost intensive.  Heck, Hotmail doesn’t even do TLS so maybe they figured that their target audience wasn’t even worth the effort.  As the above article says, as soon as they dropped TLS a few weeks ago the bot was able to double its throughput.

On our side, you may recall that we originally noticed it when we discovered that our CPU utilization had spiked.  This occurred in December 2009 (although the utilization problem didn’t become evident until a few weeks later).  An investigation showed that it was because we were attempting to negotiate all of these TLS connections.  We did some digging, collected the connecting IP addresses for a small snapshot of a single day, and compared them against a list of known botnets.  It confirmed what I suspected: most of the connections using TLS were coming from IPs associated with Rustock.
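
To give a feel for that kind of comparison, here is a minimal Python sketch.  It illustrates the idea and is not the tooling we used: the function name, the sample IPs (taken from the reserved documentation ranges), and the "OtherBot" label are all made up for the example, and it assumes the botnet list is available as a simple IP-to-name mapping.

```python
from collections import Counter


def attribute_connections(tls_client_ips, known_bot_ips):
    """Count TLS connections per botnet.

    tls_client_ips: iterable of connecting IPs, one entry per TLS connection.
    known_bot_ips:  dict mapping an IP to a botnet name (hypothetical format).
    """
    counts = Counter()
    for ip in tls_client_ips:
        # Anything not on the botnet list is bucketed as "unknown".
        counts[known_bot_ips.get(ip, "unknown")] += 1
    return counts


# Hypothetical sample data, not real measurements.
known_bot_ips = {
    "192.0.2.10": "Rustock",
    "192.0.2.11": "Rustock",
    "198.51.100.7": "OtherBot",
}
tls_client_ips = [
    "192.0.2.10", "192.0.2.11", "192.0.2.10", "198.51.100.7", "203.0.113.5",
]

counts = attribute_connections(tls_client_ips, known_bot_ips)
for botnet, n in counts.most_common():
    print(f"{botnet}: {n} of {len(tls_client_ips)} TLS connections")
```

If one name dominates the tally, as Rustock did for us, you have your culprit.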

We then quickly got to work.  And the work occurred with amazing rapidity.  We implemented a couple of fixes:

  1. The short term fix.  The short term fix was to use some of our routers to avoid advertising STARTTLS to suspect senders.  We had the router do an IP blocklist lookup before a connection hit our Exchange mail servers, and if the connecting IP was on a blocklist, the connection was diverted to a different port (say, port 28 instead of port 25, which is where email is normally handled) and to a pool of servers which did not advertise STARTTLS.  Thus, we were doing a blocklist check using the router instead of the mail server.  This worked: our CPU utilization dropped immediately.

    However, this was only a short term mitigation.  It wouldn’t scale; if we ever wanted to add more mail servers and take on more traffic, we’d have to make sure that the router could handle the connections.  Our mail servers also do things, like logging, that the router does not do.  The long term fix was to ensure that this problem would be taken care of automatically no matter how many mail hosts we added.

  2. The long term fix.  This is a fix that was made and ported into Exchange 14 (our service is the largest consumer of Exchange 14 anywhere in the world).  What happens here is that STARTTLS is delayed.  First, an IP blocklist check is done (either through an rbldns query or some other mechanism such as a local on-the-box call) and if the connecting IP is on the list, STARTTLS is not advertised.  The rationale behind this is that rather than advertising STARTTLS from the start, we first decide whether or not the connecting IP is trustworthy.  If it is not, we don’t advertise STARTTLS and the connection is rejected.  Advertising STARTTLS is conditional upon verification of reputation.

    This fix went out in March or April (I can’t remember) and it worked immediately.  Our CPU utilization continued to remain low.
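
Both fixes boil down to "check reputation first, then decide what to offer."  The Python sketch below is a rough illustration under stated assumptions, not the router configuration or the Exchange code: the function names, the `dnsbl.example` zone, the `mx.example` hostname, and the extension list are all hypothetical, and the blocklist is a local set standing in for an rbldns query.  The one standard piece is the query-name format, which is how DNS blocklists are conventionally queried for IPv4 (reversed octets followed by the zone).

```python
import ipaddress

NORMAL_PORT = 25      # standard SMTP port
DIVERTED_PORT = 28    # the "say, port 28" pool that never advertises STARTTLS


def dnsbl_query_name(client_ip, zone):
    """Build the DNS name looked up for an IPv4 blocklist check."""
    octets = str(ipaddress.IPv4Address(client_ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def pick_backend_port(client_ip, is_listed):
    """Short term fix: the router diverts listed IPs to the no-TLS pool."""
    return DIVERTED_PORT if is_listed(client_ip) else NORMAL_PORT


def ehlo_response(hostname, client_ip, is_listed):
    """Long term fix: only advertise STARTTLS to IPs that pass the check."""
    lines = [hostname, "SIZE 10485760", "8BITMIME"]  # hypothetical extensions
    if not is_listed(client_ip):
        lines.append("STARTTLS")
    # SMTP multi-line replies use "250-" on every line except the last.
    last = len(lines) - 1
    return [("250 " if i == last else "250-") + line
            for i, line in enumerate(lines)]


# Hypothetical local blocklist standing in for the real lookup.
blocklist = {"192.0.2.99"}
is_listed = blocklist.__contains__

print(dnsbl_query_name("192.0.2.99", "dnsbl.example"))
print(pick_backend_port("192.0.2.99", is_listed))
print("\n".join(ehlo_response("mx.example", "203.0.113.5", is_listed)))
```

The point of the second approach is that the decision lives in the mail server itself, so it comes along for free with every host you add.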

Of course, a few weeks ago, the Rustock botnet decided to stop sending spam over TLS altogether.  That means that all the hard work our developers and testers went through is now all for naught.  All we would have had to do is not do anything and the problem would have resolved itself.  However, we all certainly learned a lot: spammers may be able to operate quickly, but then again, so can we.