With all the hoopla surrounding DMARC, I thought I would take the time to see how SPF is functioning in real life, at least in our network.
According to DMARC documentation, SPF and DKIM adoption is reaching critical mass (see page 5 of that link):
Over 75% of mail volume is SPF authenticated.
I decided to cross-check this against our own email statistics. Unfortunately, my data is not as thorough as the above, but it’s the best I have.
The results of messages going through the spam filter of our Forefront Online customer base, for the past 45 days (counts of several billion messages), are below:
You can see that, far from 75%+ of messages having SPF records, only a little over half of them do. Nearly half of messages have no SPF records. (I have no data on the total number of domains with SPF records, so I cannot correlate sender size with SPF implementation.)
You may argue “Well, those messages that are all SPF None, those are spam messages. The majority of legitimate senders use SPF.”
But that’s not true. I also have statistics on the spam and non-spam proportions of messages with each SPF disposition, compared against the baseline of all of our messages:
The baseline is the network-wide, after IP blocks, spam/non-spam ratio. Green is good mail, red is spam. 90% of our content filtered mail (before bifurcation) is non-spam (it looks so good because we reject so much mail in our IP filters, and bifurcation later on skews the ratio). Anything with green above 90% is better than average.
You can see above that SPF Pass is 94% good mail. That means that there are still 6% of messages that are marked as spam that pass an SPF check. Who are these senders? I don’t know, maybe spammy newsletters, maybe dirty jokes, maybe snowshoe spammers, and so forth.
But messages with SPF none are marked as non-spam 87% of the time. Either we have a very high false negative rate (which is unlikely because we monitor abuse inbox trends) or most mail without SPF records is legitimate. I’m not saying that all of it is, but rather that there are a lot of smaller senders out there that have not implemented SPF records and are sending good mail. It’s not as clean as messages that pass SPF checks, but it’s not horrible, either.
Also interesting are SPF soft fails. Their spam/non-spam rate is exactly the same as our baseline.
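To make the comparison concrete, here is a rough sketch that restates the percentages quoted above as differences from the 90% baseline. The names are made up for illustration, and hard fails are left out because I only have a qualitative statement for them, not a number.

```python
# Non-spam percentages as quoted in this post (post-IP-block mail).
# Hard fail is deliberately omitted: no figure is given for it.
BASELINE_NON_SPAM = 90

non_spam_by_disposition = {
    "pass": 94,
    "none": 87,
    "softfail": 90,
}


def compare_to_baseline(rates, baseline):
    """Return the percentage-point difference from the baseline per disposition."""
    return {name: rate - baseline for name, rate in rates.items()}


deltas = compare_to_baseline(non_spam_by_disposition, BASELINE_NON_SPAM)

for name, delta in deltas.items():
    if delta > 0:
        label = "better than"
    elif delta < 0:
        label = "worse than"
    else:
        label = "the same as"
    print(f"SPF {name}: {delta:+d} points, {label} baseline")
```

A positive number means that disposition is cleaner than our average content-filtered mail; a negative number means it is dirtier.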
Finally, even SPF hard fails are not marked as spam most of the time. Why is this? The most frequent example is people logging in from a remote location and sending mail from “the hotel” rather than their corporate mail server. Other times people just have their mail servers misconfigured, or their SPF records not fully populated, or something. In any case, SPF hard fails are not indicative of spam in our post-IP blocked mail.
One caveat for this is that if we had statistics for pre-IP blocked mail, these values would change significantly.
My conclusion in all of this is that SPF checks alone are not that great for detecting spam and phishing; most of the spam is caught with other filtering techniques. Instead, they are better used for validation of the sending domain and then using that as a whitelist or fast-track lane for filtering for domains you want to receive mail from. Also, we still have a ways to go before everyone is using SPF.
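As a minimal sketch of that whitelist or fast-track idea (this is not how our filter is actually implemented; the function name, lane names, and the trusted-domain list are all hypothetical), the logic might look something like this:

```python
# Hypothetical list of domains you have decided you want to receive mail from.
TRUSTED_DOMAINS = {"partner.example.com"}


def choose_lane(spf_result, mail_from_domain, trusted_domains):
    """Pick a filtering lane from the SPF result and the SMTP MAIL FROM domain.

    An SPF pass only proves the sending IP is authorized for the domain,
    so it is combined with a list of domains that are explicitly trusted.
    Everything else, including SPF none, soft fail, and hard fail, still
    goes through the normal filters rather than being rejected outright.
    """
    domain = mail_from_domain.lower()
    if spf_result == "pass" and domain in trusted_domains:
        return "fast_track"
    return "full_filtering"


print(choose_lane("pass", "partner.example.com", TRUSTED_DOMAINS))
print(choose_lane("pass", "unknown.example.org", TRUSTED_DOMAINS))
print(choose_lane("fail", "partner.example.com", TRUSTED_DOMAINS))
```

Note that a failing or missing SPF result does not send the message to a "spam" lane here; given the numbers above, it just means the message gets no shortcut.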
Can you confirm that your numbers are talking of SPF checks (so against the Return Path) and not SenderID checks (against from/sender headers)? I expect slightly different results from the two.
They are SPF checks against the SMTP Mail From, not SenderID checks against the message From.
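For anyone unsure why the two can differ, here is a simplified sketch of which domain each check looks at. The addresses are invented, and real Sender ID derives its identity from several headers (Resent-Sender, Resent-From, Sender, From), so reading only the From header is an assumption made to keep the example short.

```python
from email import message_from_string
from email.utils import parseaddr


def domain_of(address):
    """Return the lowercased domain part of an email address."""
    return address.rpartition("@")[2].lower()


def spf_domain(envelope_mail_from):
    """Domain SPF evaluates: the SMTP MAIL FROM (Return-Path) address."""
    return domain_of(envelope_mail_from)


def header_from_domain(raw_message):
    """Domain taken from the From header (simplified stand-in for Sender ID)."""
    msg = message_from_string(raw_message)
    _display_name, address = parseaddr(msg.get("From", ""))
    return domain_of(address)


# Invented example: a newsletter sent through a third-party mailer.
envelope_mail_from = "bounces@mailer.example.net"
raw_message = "From: News <news@example.com>\nSubject: Hello\n\nBody text\n"

print(spf_domain(envelope_mail_from))
print(header_from_domain(raw_message))
```

Because the two checks can be looking at different domains for the same message, their pass/fail statistics will not necessarily line up.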
(Late to the party but) In response to "That means that there are still 6% of messages that are marked as spam that pass an SPF check. Who are these senders? I don’t know, maybe spammy newsletters, maybe dirty jokes, maybe snowshoe spammers, and so forth." one big factor IME is user error/bad UI design.
I work with two clients who deal with member registration and submissions, which results in confirmation emails, password reset requests, and delivering notifications the user explicitly signed up for. While a tiny percentage, we do get regular spam reports on feedback loops in response to these 100% legitimate, and often explicitly requested, emails. The best/worst example is when someone requests we email them the status of their earlier submissions, and then marks it as spam when we deliver it a second later.
Then there is the UI factor: I've seen more than one major webmail provider that puts the Delete and Spam buttons right next to each other. No confirmation step on either. Worse yet, sometimes the Spam button will be unavailable for internal emails (i.e. the mail provider's own marketing and update emails). When you delete one and go to the next email, the buttons shift over as Spam becomes available again, and now clicking where Delete WAS gets the Spam button instead. I've fallen victim to this myself (i.e. as a recipient who did not intend to report an email as spam).