In a story announced last week, Hotmail released a new version to help users deal with the problem of gray mail.  Gray mail is marketing mail that straddles the line between spam and ham; to some users it’s spam, but to others it is legitimate.  This makes it difficult for filters to make a global decision: no matter what action the filters take at a global level, some users will complain about missed spam and others about false positives (an example from back in the day was messages from reunion.com).

From the Hotmail blog:

Graph showing Hotmail Inbox 2006

When inbox spam was at 30%, our job was really clear—our enemy, clever as he remains, was impossible to miss. We made huge investments in SmartScreen and reduced spam to historic lows of less than 3%.

With spam at manageable levels, we began looking at the rest of the inbox, and what we found was pretty surprising.

Graph showing Hotmail Inbox 2012

We could easily tell which messages were person-to-person, and we identified spam getting past our filters. The majority of what was left was something we refer to as graymail, and when thinking about how to deal with graymail, it became clear that the fundamental problem wasn’t just which things to accept or reject. Unlike spam, which everyone wants to be rid of, there is no general agreement on how to deal with graymail.

<snip>

Using Hotmail’s categorization tool, you can change the categorization of a message—for example, marking or unmarking it as a newsletter. This generates feedback that the newsletter filter learns from, so it’s able to overcome previous mistakes as well as stay on top of new newsletters. This means the rules set up to deal with newsletters will not just apply to old ones, but also to new newsletters created after you’ve refined the rules to deal with newsletters. The best part is that SmartScreen learns from what customers do with their newsletters, and everyone benefits as the filter gets smarter!

The essence of the feature is that Hotmail’s spam filters are being trained to identify newsletters more accurately.  Users can then categorize these messages efficiently, and the messages are visually marked as newsletters so users can navigate their inboxes more quickly.

Users can then mark or unmark newsletters depending on what they think the message is.  This helps to build a more personalized inbox.
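To make the mark/unmark idea concrete, here is a minimal sketch of how a per-user correction could sit on top of a global newsletter verdict. It is an illustration only, not Hotmail's implementation: the class name, the method names, and the choice to key corrections by sender address are all assumptions made for this example.

```python
from collections import defaultdict


class PersonalNewsletterLabels:
    """Hypothetical store of per-user corrections to a global newsletter verdict."""

    def __init__(self):
        # user -> {sender address -> True (newsletter) / False (not a newsletter)}
        self._overrides = defaultdict(dict)

    def record_feedback(self, user, sender, is_newsletter):
        # Called when a user marks or unmarks a message as a newsletter.
        self._overrides[user][sender.lower()] = is_newsletter

    def is_newsletter(self, user, sender, global_verdict):
        # A user's own correction wins; otherwise fall back to the global filter.
        return self._overrides.get(user, {}).get(sender.lower(), global_verdict)


labels = PersonalNewsletterLabels()
# Alice unmarks a sender that the global filter treats as a newsletter.
labels.record_feedback("alice", "News@Example.com", False)
alice_view = labels.is_newsletter("alice", "news@example.com", global_verdict=True)
bob_view = labels.is_newsletter("bob", "news@example.com", global_verdict=True)
```

In this sketch Alice's correction changes only her own view, while Bob still gets the global verdict. A real system would presumably also feed the corrections back into the global filter, as the Hotmail post describes.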

The feature is similar to Gmail’s Priority Inbox, which has been around for a little over a year.  It is also similar to our own feature for handling Bulk Mail, which we released seven months ago.

Yet our feature is also different from Hotmail’s.  Consider their definition of a newsletter:

To get Hotmail to identify newsletters for us, we began by making a list of newsletter characteristics and built a piece of software to extract them from incoming emails. This list forms the model of what makes newsletters different from all other mail and includes three aspects: presence of the List-Unsubscribe header, the sending email address, and what gets shown to the user.
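The three aspects in that description can be sketched as a small feature extractor. This is a rough illustration built on Python's standard email module, not Hotmail's software: the function name and feature keys are hypothetical, and using the Subject line as a stand-in for "what gets shown to the user" is an assumption.

```python
from email import message_from_string
from email.utils import parseaddr


def newsletter_features(raw_message):
    """Extract illustrative newsletter signals from a raw RFC 822 message."""
    msg = message_from_string(raw_message)
    # parseaddr returns a (display name, address) pair.
    _, address = parseaddr(msg.get("From", ""))
    address = address.lower()
    return {
        # Aspect 1: presence of the List-Unsubscribe header.
        "has_list_unsubscribe": msg.get("List-Unsubscribe") is not None,
        # Aspect 2: the sending email address (and its domain).
        "sender_address": address,
        "sender_domain": address.rpartition("@")[2],
        # Aspect 3: what the user sees; the subject is used as a stand-in here.
        "subject": msg.get("Subject", ""),
    }


raw = (
    "From: Example Deals <News@Example.com>\n"
    "List-Unsubscribe: <mailto:unsub@example.com>\n"
    "Subject: Weekly offers\n"
    "\n"
    "Hello\n"
)
features = newsletter_features(raw)
```

A classifier would then be trained on features like these; the extractor only gathers the signals and makes no decision by itself.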

Newsletters that have these characteristics are more often legitimate than not (well, that was the case in the past, although it is less true today).  By contrast, our bulk mail filter covers a wider range of email:

Spam ---> Bulk mail filter <--- Good mail

Thus, whereas Hotmail and Gmail both lean more towards legitimate mail, we lump dark gray-hat marketers in with lighter gray-hat marketers.

As I have written elsewhere on this blog, bulk mail (along with snowshoe spam) is among the most complained-about spam today.  But it’s still difficult to tell the bulk mail that users want from the bulk mail they don’t.  The future of spam filtering lies not in detecting malicious spam from botnets, but in personalizing the user experience so that the bulk mail users want does arrive in their inbox.