Oh No! Security Metrics!
Hello, Michael here.
A colleague sent me a link to a blog post from a couple of days ago: Pete Lindstrom of Burton Group blogged that "Microsoft's SDL has Saved the World!!", raising concerns about Microsoft using vulnerability counts to measure the security improvement resulting from the SDL.
I've raised this topic before, in my blog post "The First Step on the Road to More Secure Software is admitting you have a Problem." Here are two pertinent quotes from that blog post of Feb 21st:
"Let's face it, no-one can agree on any measurement of security without getting knotted up."
"Measuring security is a real challenge, and while we may debate the merits of vulnerability counts, right now it's the only concrete metric we have."
These comments are very important because there appears to be no more widely accepted security metric today, and while no perfect metrics exist, it's useful to have some objective data when trying to discuss this complex subject. Our customers constantly tell us to reduce the number of patches they need to apply to their products once in deployment. It costs them time and money to deploy security updates. The primary metric that matters to customers is the number of security updates they need to apply. And the only way to reduce the number of updates is to systematically reduce the number and severity of vulnerabilities in the code in the first place - that's the goal of the SDL.
In my mind there are two kinds of vulnerability metrics, and I alluded to both in my prior blog post. The first vulnerability metric compares Microsoft to Microsoft; in other words, we compare Windows XP to Windows Vista, SQL Server 2000 to SQL Server 2005, and so on. We use this metric while a product is being built: we track incoming security bugs for the prior version of the product to see how we're faring with the current version in development. Fewer vulnerabilities in the product under development are a good sign that the product will fare better in the real world.
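To make the version-to-version comparison concrete, here is a minimal sketch of how such a figure could be calculated. The class and function names, the severity buckets, and the numbers are all hypothetical placeholders chosen for illustration; they are not Microsoft data and not how we track this internally.

```python
from dataclasses import dataclass


@dataclass
class ReleaseCounts:
    """Security bugs reported against one release, bucketed by severity."""
    name: str
    by_severity: dict[str, int]

    def total(self) -> int:
        return sum(self.by_severity.values())


def percent_change(prior: ReleaseCounts, current: ReleaseCounts) -> float:
    """Percentage change in total count from prior to current (negative means fewer)."""
    if prior.total() == 0:
        raise ValueError("prior release has no recorded vulnerabilities to compare against")
    return 100.0 * (current.total() - prior.total()) / prior.total()


# Placeholder numbers for illustration only; not real product data.
prior = ReleaseCounts("Product v1", {"critical": 10, "important": 20, "moderate": 10})
current = ReleaseCounts("Product v2", {"critical": 4, "important": 12, "moderate": 4})

print(f"{current.name} vs {prior.name}: {percent_change(prior, current):+.1f}%")
```

A comparison like this only means something if both releases are measured over the same window (for example, the same number of months after ship) and with the same severity definitions.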
The second vulnerability metric compares Microsoft with other vendors. This is an interesting metric, but our group is full of engineers, so we pay little attention to the figures because there is very little in them that we can influence. In the post Mr. Lindstrom refers to, I cited those vulnerability figures to point out that other development organizations need to admit they have a secure development problem. Looking back at the figures we cited, it's pretty clear that the sheer volume of security vulnerabilities supports our assertion, regardless of the subtleties of security metrics. More about this below.
Mr. Lindstrom states:
"Microsoft has systematically hired and/or contracted with every one of their most vocal critics (and most seasoned bugfinders) to do the work behind the scenes and they don't count those vulns!"
But in making this assertion, he's saying the vulnerabilities we remove (and do not add to the code in the first place) as part of the SDL process should be counted as though they were part of the product after we shipped it. We don't count vulnerabilities that don't affect customers, regardless of the vendor.
We hire some security researchers to be part of our teams executing the SDL because they're among the best and brightest at performing component design reviews, code reviews, black box testing and other security procedures needed to make our products more secure. Everyone in the industry covets their expertise because it's in short supply, and so we've competed to bring in the most capable people - as employees, contractors and advisors. These experts, helping us execute the SDL, have helped Microsoft eliminate vulnerabilities before our products ship, which naturally means lower vulnerability counts and improved security for our customers. In addition, bringing in researchers helps us to better understand what the community is thinking about today so we can anticipate and head off the problems of tomorrow.
To put it bluntly, we hire security researchers to help protect customers. Period.
While we've made an effort to hire the right researchers to help us improve the security of our products, it's far from the case that we've hired "every one of their [our] most vocal critics." There are still plenty of security researchers who are looking at our products and reporting the vulnerabilities they find after the products ship.
We have found that the training and principles of SDL have indeed significantly improved the products Microsoft engineers create. You improve security by expending effort on improving security. We have seen the evidence of this in the fewer customer updates being released against that code. When applied correctly, the SDL development principles prevent vulnerabilities from entering the final code in the first place. This last point is very, very important: you can't count a bug that was never created; the goal of the SDL is to not create the bugs in the first place.
Some of the many SDL principles that reduce or mitigate security bugs include:
- Mandatory education (Net effect: fewer security bugs up front)
- Design decisions based on threat models (Net effect: fewer security design bugs up front)
- Cannot use known insecure APIs (Net effect: fewer security bugs up front)
- Use of static analysis tools (Net effect: fewer security bugs enter the central source code repository)
- Cannot use known weak crypto primitives and key lengths (Net effect: fewer security bugs up front)
- Compiler and linker requirements (Net effect: extra defenses, in case you miss a bug)
- Fuzz testing (Net effect: implementation bugs found before shipping)
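As a rough illustration of the "cannot use known insecure APIs" principle, the sketch below flags calls to a handful of C functions in a piece of source text. The deny-list here is a short, hypothetical sample, and a plain regular-expression scan is a stand-in I chose for brevity; it is not the static analysis tooling mentioned above, and it will not understand comments, macros, or string literals.

```python
import re

# Hypothetical, abbreviated deny-list for illustration; a real list is much longer.
BANNED_CALLS = ("strcpy", "strcat", "sprintf", "gets")

# Match a banned name used as a function call, e.g. "strcpy(".
# The leading \b keeps names such as "fgets" or "my_strcpy" from matching.
_PATTERN = re.compile(r"\b(" + "|".join(BANNED_CALLS) + r")\s*\(")


def find_banned_calls(source: str) -> list[tuple[int, str]]:
    """Return (line_number, function_name) for each banned call found."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


sample = """\
#include <string.h>
void copy(char *dst, const char *src) {
    strcpy(dst, src);
    my_strcpy(dst, src);
}
"""

for lineno, name in find_banned_calls(sample):
    print(f"line {lineno}: banned call {name}()")
```

A check like this is most useful when it runs before code is checked in, which is the point of the "up front" wording in the list above.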
So, to answer Mr. Lindstrom's question:
"Could it really be that SDL has done nothing to help MS developers write better code?"
Without a doubt, the SDL has helped Microsoft developers write better and more secure code.
However, we are still faced with the question of whether vulnerability-based metrics are a valid way to measure the progress of the SDL. In my opinion, vulnerability counts are a useful metric, but an imperfect one. We'd welcome Mr. Lindstrom, and anyone else in the security community, sharing with us the metrics they would use to measure security-related success and how to calculate them. While the notion of what constitutes a "real, objective metric" is often based on individual preference, I think both the efficacy of the SDL and the industry as a whole would benefit from this discussion.
Interestingly, Mr. Lindstrom has at times pointed to vulnerability counts as an interesting (but not perfect) metric.
One final comment: if the Microsoft product security vulnerability trend were in the other direction, up and not down, would industry observers claim the SDL is failing? I think so.
The SDL works; it's not perfect, we've never said it is, but it's making our customers happier because they have fewer security updates to apply. Not zero, but fewer. And we are always looking for ways to improve how we measure the progress of SDL.
As we've been saying all along, industry dialogue is key - so let us know what you think.