October, 2008

  • The Security Development Lifecycle

    Applying SDL Principles to Legacy Code

    Hello, this is Scott Stender from iSEC Partners, one of the SDL Pro Network partners. As security consultants, we at iSEC work with a variety of companies to drive security throughout their development cycle. Clients with mature security processes ask that we help carry out parts of their process, from requirements analysis to penetration testing. Other clients need help defining their security processes, and we help define and kick off a program based on the Microsoft SDL, other defined processes, or variations thereof, depending on the client’s needs and abilities. Whether participating in an existing process or helping define one, I personally have been lucky enough to have seen my fair share of successes and failures, and it is this perspective that I hope to share in this guest post.

    I find that legacy code poses a unique challenge for organizations rolling out a new security process.  Often, the resources dedicated to maintaining older code are a small fraction of those devoted to new features or products.  Furthermore, the original developers for such features have often moved on, leaving no subject matter experts to drive reviews.  The astute reader will ask “How do I apply the principles of the Microsoft SDL to legacy code when I have no development resources and nobody knows how it works?”

    The answer is “Start small, and build expertise over time.”

    A Rising Tide Lifts All Boats

    The best thing a security engineering team can do to improve security in the short term is to drive code quality, and the first step in this process is to define and enforce a secure coding standard.  This helps on two fronts: 

    1.       It will improve code quality and reduce implementation flaws across the entire code base.  Unlike other security processes, driving a secure coding standard is relatively easy to accomplish across an entire code base, regardless of the code’s age, by a focused security team.  That is not to say that it is easy without qualification – a large batch of spaghetti code will require a lot of work to untangle!  Such an effort can only be called “easy” when compared to, say, comprehensive identification and remediation of design flaws across legacy features.  Even so, improving code quality through the use of secure coding standards offers a unique combination of high impact, applicability to features, and ability to be carried out by a core team that makes it a sensible first step.

    2.       The security team might notice that some sections of code have more standards violations or outright flaws than others.  This is an instance of vulnerability clustering, a concept that has been used to predict vulnerability rates and improve quality in the functional realm.  The evidence is anecdotal, but it stands to reason that portions of code that consistently violate secure coding standards are good places to start looking for other classes of security flaw.  These are security hotspots, and should be high on the prioritized list for further review.
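    The hotspot idea above can be sketched as a simple ranking. This is only an illustration under stated assumptions: the `module_stats` record, the `rank_hotspots` function, and the choice of "violations per thousand lines" as the ordering metric are all hypothetical and do not come from the SDL itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical record: coding-standard violations found in one module. */
struct module_stats {
    const char *name;
    unsigned violations;   /* violations reported by review or tooling */
    unsigned kloc;         /* module size in thousands of lines, > 0   */
};

/* Order by violations per KLOC, highest density first. The two ratios
 * are compared by cross-multiplication so no floating point is needed. */
static int by_density_desc(const void *a, const void *b)
{
    const struct module_stats *x = a;
    const struct module_stats *y = b;
    unsigned long long lhs = (unsigned long long)x->violations * y->kloc;
    unsigned long long rhs = (unsigned long long)y->violations * x->kloc;
    return (lhs < rhs) - (lhs > rhs);
}

/* Sort modules in place so the densest (assumed riskiest) come first. */
void rank_hotspots(struct module_stats *mods, size_t count)
{
    assert(mods != NULL || count == 0);
    qsort(mods, count, sizeof mods[0], by_density_desc);
}
```

    After sorting, the modules at the front of the array are the first candidates for deeper review. Density is only one plausible metric; a team might equally weight by exposure to untrusted input.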

    Security testing may also be applied to legacy code, but initial activities should be considered on a case-by-case basis based on the expected return on investment.  Such testing ranges from using inexpensive off-the-shelf tools to exercise common interfaces to rather expensive custom testing and formal analysis.  It is worthwhile to begin with off-the-shelf tools, such as those that target file parsers or web applications, and tools created as part of your greater secure development efforts.  These can help identify easily-found flaws and suggest improvements to the coding standards.  Comprehensive security testing, on the other hand, is best tackled after the Legacy Security Push.

    The Legacy Security Push

    Coding standards and basic testing provide bang for the buck, but formal security processes seek to provide security assurance.  The challenge for legacy code is that it needs to play catch-up.  Security processes that occur early in the development cycle, such as requirements analysis, design review, and threat modeling, are particularly difficult to achieve years after the fact.  The main goal of the Legacy Security Push is to create the deliverables from these efforts, the most important of which are security requirements and a full risk analysis.

    It may sound trivial, but security requirements are essential.  Not only do they define proper operation for the system in question, they also define assumptions that are suitable for relying systems.   It is very common to find security flaws in legacy systems that arise from well-intentioned but incorrect assumptions such as “I assume that the Foo authenticates server Bar when initiating a bank transfer.”  It stands to reason that Foo would do so for such an important activity, but this assumption must be validated.  It is very common for older features to have been deployed in and written for different environments where the security assumptions that are "obvious" today just didn't apply at the time.

    When reviewing legacy systems, the first step is to identify such requirements.  If the original architects, developers or managers are available, they can provide valuable insight at this stage.  More often than not this is not the case, and analysis must instead rely on what documentation is present and interaction between the software and its consumers.  The goal is the same as in requirements analysis during project inception, except that in this case one must turn the process on its head and reverse engineer requirements from system behavior.  At the conclusion of this effort, requirements can be theorized – “Foo must authenticate its server Bar before initiating a bank transfer.” 

    Risk analysis can be performed once a plausible set of requirements has been identified.  Threat modeling is a more structured means of performing such an analysis, with the eventual goal of identifying means by which requirements can be violated by an attacker. 

    As with requirements analysis, original developers would be a valuable resource to consult.  With or without such help, the first step is to identify how the software works.  In many cases, help is not available and performing this task requires a great deal of effort.  For features of moderate size, this author has spent upwards of a month reading code, using process profiling tools, and walking through the software with a debugger to identify program flow and security-sensitive functionality.

    Once completed, actual system behavior should be documented and compared against the requirements theorized.   It might be that the requirements should be re-evaluated (New requirement:  Do not assume that Foo requires server authentication) or the system may need to be changed (New bug:   Foo does not verify the CN for Bar).  At the end, this information should be sufficient to support a comprehensive threat modeling exercise where security requirements, risks, and their mitigations can be documented.

    Next Steps

    Bringing a legacy feature up to par with its newer kin requires a relatively small number of items:  improved code quality, clear security requirements, and a thorough threat model.  As we have seen, performing even these tasks is quite the effort!  I am sure that it is little comfort to be reminded that accomplishing these tasks has simply laid the foundation, and that the true benefit is that the newly-reviewed legacy feature is able to participate fully in the security processes that remain: reviewing cross-component security requirements and assumptions, comprehensive testing, and incident planning, to name a few.

    Unfortunately, there is no silver bullet in security assurance.  The soundness of the design and implementation of legacy software is just as important as in newer software, which is why any complete secure software development process will look backwards as well as forwards.  Feature by feature, from higher priority to lower, the overall security of the software improves as legacy code receives the full security treatment it deserves.

    Did you find the silver bullet?  Might you think that defining security requirements is unnecessary?  Perhaps “It is old and has not been attacked yet.” is a valid security strategy!  Please comment below or email me directly at scott@isecpartners.com and share your thoughts.
  • The Security Development Lifecycle

    MS08-067 and the SDL

    Hi, Michael here.

    No doubt you are aware of the out-of-band security bulletin issued by the Microsoft Security Response Center today. Like all security vulnerabilities, this is one we can learn from and, if necessary, use to shape future versions of the Security Development Lifecycle (SDL).

    Before I get into some of the details, it's important to understand that the SDL is designed as a multi-pronged security process to help systemically reduce security vulnerabilities. In theory, if one facet of the SDL process fails to prevent or catch a bug, then some other facet should prevent or catch the bug. The SDL also mandates the use of security defenses, because we know full well that the SDL process will never catch all security bugs. As we have said many times, the goal of the SDL is to "Reduce vulnerabilities, and reduce the severity of what's missed."

    In this post, I want to focus on the SDL-required code analysis, code review, fuzzing and compiler and operating system defenses and how they fared.

    Code Analysis and Review

    I want to start by analyzing the code to understand why we did not find this bug through manual code review or through our static analysis tools. First, the code in question is reasonably complex code to canonicalize path names; for example, it strips out '..' sequences and such to arrive at the simplest possible directory name. The bug is a stack-based buffer overflow inside a loop; buffer overruns in loops, especially complex loops, are difficult to detect with a high degree of probability without producing many false positives. At a later date I will publish more of the source code for the function.

    The loop inside the function walks along an incoming string to determine whether a character in the path might be a dot, dot-dot, slash or backslash and, if it is, applies canonicalization algorithms.
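    To make the shape of such a loop concrete, here is a minimal sketch of a path canonicalizer. It is not the affected Windows code: `canonicalize_path` and `is_sep` are hypothetical names, it works on narrow characters for brevity, and it deliberately tracks the write position as an index that is checked against the buffer capacity on every store, rather than as a free-moving pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: both slash styles count as separators. */
static int is_sep(char c) { return c == '\\' || c == '/'; }

/*
 * Hypothetical sketch: collapse "." and ".." components of `in` into `out`,
 * which holds `cap` bytes. Returns 0 on success, or -1 if the result does
 * not fit or if ".." would climb above the root.
 */
int canonicalize_path(const char *in, char *out, size_t cap)
{
    size_t len = 0;                       /* bytes written; always < cap */
    assert(in != NULL && out != NULL);
    if (cap == 0) return -1;

    while (*in != '\0') {
        while (is_sep(*in)) in++;         /* skip runs of separators */
        if (*in == '\0') break;

        size_t n = 0;                     /* length of the next component */
        while (in[n] != '\0' && !is_sep(in[n])) n++;

        if (n == 1 && in[0] == '.') {
            /* "." refers to the current directory: emit nothing. */
        } else if (n == 2 && in[0] == '.' && in[1] == '.') {
            /* "..": drop the previous component, never past the start. */
            if (len == 0) return -1;
            while (len > 0 && out[len - 1] != '\\') len--;
            len--;                        /* drop its leading separator */
        } else {
            /* Need room for separator + component + terminating NUL. */
            size_t room = cap - len;      /* at least 1 by the invariant */
            if (room < 2 || n > room - 2) return -1;
            out[len++] = '\\';
            memcpy(out + len, in, n);
            len += n;
        }
        in += n;
    }

    if (len == 0) {                       /* everything cancelled out */
        if (cap < 2) return -1;
        out[len++] = '\\';
    }
    out[len] = '\0';
    return 0;
}
```

    With a 16-byte output buffer, the input `\a\b\..\c` canonicalizes to `\a\c`, while `..\x` is rejected because it would back up past the start of the buffer. The point of the index-based design is that "how much room is left" is always derived from one trusted value.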

    The irony of the bug is that it occurs in a call to a bounded function:

    _tcscpy_s(previousLastSlash, pBufferEnd - previousLastSlash, ptr + 2);

    This function is a macro that expands to wcscpy_s(dest, len, source); technically, the bug is not in the call to wcscpy_s, but in the way the arguments are calculated. As I alluded to, all three arguments are highly dynamic and constantly updated within the while() loop. There is a great deal of pointer arithmetic in this loop. Without going into all the gory attack details, given a specific path, and after the while() loop has been passed through a few times, the pointer, previousLastSlash, gets clobbered.
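    Why a bounded copy does not help when its arguments are wrong can be shown with a small sketch. The exact state of previousLastSlash in the real bug is not given here, so the scenario below is an assumption: a destination pointer that has drifted to just before the start of the buffer. `claimed_space` and `checked_space` are hypothetical helpers, not part of any Microsoft API.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/*
 * Hypothetical helper: the "space left" a caller computes as end - dest,
 * the same shape as pBufferEnd - previousLastSlash in the call quoted above.
 * Nothing here checks that dest is still inside the buffer.
 */
size_t claimed_space(const wchar_t *dest, const wchar_t *buffer_end)
{
    assert(dest <= buffer_end);
    return (size_t)(buffer_end - dest);
}

/*
 * Defensive variant: also require that dest lies within
 * [buffer, buffer + capacity]. Returns 0 when it does not, so the caller
 * can refuse to copy. Both pointers are assumed to point into the same
 * enclosing object, which keeps the comparisons well defined in C.
 */
size_t checked_space(const wchar_t *buffer, size_t capacity,
                     const wchar_t *dest)
{
    assert(buffer != NULL && dest != NULL);
    if (dest < buffer || dest > buffer + capacity) return 0;
    return capacity - (size_t)(dest - buffer);
}
```

    If a 32-character buffer sits at offset 32 of a 64-character region and the destination pointer has slipped 4 characters before the buffer, `claimed_space` reports 36 characters of room even though the buffer only holds 32. A bounded copy given that length will happily write outside the buffer; `checked_space` returns 0 for the same pointer.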

    In my opinion, hand reviewing this code and successfully finding this bug would require a great deal of skill and luck. So what about tools? It's very difficult to design an algorithm which can analyze C or C++ code for these sorts of errors. The number of possible variable states grows very, very quickly. It's even more difficult to take such algorithms and scale them to non-trivial code bases. This is made more complex because the function accepts a highly variable argument; it's not like the argument is the value 1, 2 or 3! Our present toolset does not catch this bug.

    Ok, now I'm really going out on a limb with this next section.

    Over the last year or so I've noticed that the security vulnerabilities across Microsoft, but most noticeably in Windows, have become bugs of a class I call "onesey-twosies"; in other words, one-off bugs. There is a good side and a bad side to this. First the good news: I think perhaps we have removed a good number of the low-hanging security vulnerabilities from many of our products, especially the newer code. The bad news is that we'll continue to have vulnerabilities, because you cannot train a developer to hunt for unique bugs, and creating tools to find such bugs is also hard to do without incurring an incredible volume of false positives. With all that said, I will add detail about one-off bugs to our internal education; I think it's important to make people aware that even with great tools and great security-savvy engineers, there are still bugs that are very hard to find.

    Fuzz Testing

    I'll be blunt: our fuzz tests did not catch this and they should have. So we are going back to our fuzzing algorithms and libraries to update them accordingly. For what it's worth, we constantly update our fuzz testing heuristics and rules, so updating them in response to this bug is nothing unusual.

    Defenses

    If you want the full details of the defenses, and how they come into play on Windows Vista and Windows Server 2008, I urge you to read the SVRD team's in-depth analysis once it is posted.

    A big focus of the SDL is to define and require defenses because we have no illusions about finding or preventing all security vulnerabilities by attempting to get the code right all the time, because no-one can do that. No one.  See my comment above about one-off bugs!

    Let's look at each SDL-mandated requirement and how it fared in light of this vulnerability.

    -GS

    The -GS story is not so simple. A lot of code is executed before a cookie check is made and the attacker can control the overflow because the overflow starts at an offset before the stack buffer, rather than at the stack buffer itself. So the attacker can overwrite other frames on the call stack, corresponding to functions that return before a cookie check is made. That's a long way of saying that -GS was not meant to prevent this type of scenario.

    ASLR and NX

    The code fully complies with the SDL, and is linked with /DYNAMICBASE and /NXCOMPAT on Windows Vista and Windows Server 2008. These are great defenses when used together, and they reduce the chance of a successful attack substantially. The stack offset is randomized too, making a deterministic attack even more unlikely.

    Service Restart Policy

    By default the affected service is marked to restart only twice after a crash on Windows Vista and Windows Server 2008, which means the attacker has only two attempts to get the attack right. Prior to Windows Vista, the attacker has unlimited attempts because the service restarts indefinitely.

    Authentication

    Thanks to mandatory integrity control (MIC) settings (which come courtesy of UAC), the networking endpoint that leads to the vulnerable code requires authentication on Windows Vista and Windows Server 2008 by default. Prior to Windows Vista, the endpoint is always anonymous, so anyone can attack it, so long as the attacker can traverse the firewall. This is a great example of SDL's focus on attack surface reduction; requiring authentication means the number of attackers that can access the entry point is dramatically reduced.

    Firewall

    We enabled the firewall by default in Windows XP SP2 and later; this was a direct lesson from the Blaster worm. By default, ports 139 and 445 are not opened to the Internet on Windows XP SP2, Windows Vista and Windows Server 2008.

    Summary

    The $64,000 question we ask ourselves when we issue any bulletin is "did SDL fail?" and the answer in this case is categorically "No!" No, because, as I said earlier, the goal of the SDL is "Reduce vulnerabilities, and reduce the severity of what you miss." Windows Vista and Windows Server 2008 customers are protected by the defenses in the operating system that have been crafted in part by the SDL. The development team who built the affected component compiled and linked with the appropriate settings as described in "Windows Vista ISV Security" and Writing Secure Code for Windows Vista so that their service is protected by the operating system.

    The team did not poke holes through the firewall unnecessarily, in accordance with the SDL.

    The team reduced their attack surface, in accordance with the SDL, by requiring authenticated connections rather than anonymous connections by default.

    We know that the SDL-mandated -GS has very strict heuristics so some functions are not protected by a stack cookie, but in this case, there is no buffer on the stack, so there will be no cookie. We know this. There are no plans to remedy this in the short term.

    Fuzzing missed the bug, so we will update our fuzz testing heuristics, but we continually update our fuzzing heuristics anyway.

    In short, based on what we know right now, Windows Vista and Windows Server 2008 customers are protected because of the SDL-mandated defenses in the operating system, and because the development team adhered to the letter of the SDL to take advantage of those defenses.

    Chalk one up for Windows Vista and later and the SDL!

    As usual, questions and comments are very welcome.

  • The Security Development Lifecycle

    Good hygiene and Banned APIs

    Jeremy Dallman here with a quick note about a code sanitizing tool we are making available to support one of the SDL requirements – Remove all Banned APIs from your code.

    This requirement was put in place to prevent use of certain older C runtime functions that lead to buffer overrun flaws and have been deprecated. In the Security Development Lifecycle book, an entire chapter is dedicated to the topic of banned function calls. In the book, we also provide a copy of the banned.h header file on the companion CD. This header file allows you to locate any banned functions in your code.

    On MSDN, we have documented the SDL list of Banned Function Calls, but the header file has not been publicly available outside the SDL book until now. Today, we are providing the banned.h header on the Microsoft Download Center.

    Find the banned.h header here

    By adding this header file to your project and then using #include "banned.h", you will be able to locate any banned functions in your code. The full list of banned APIs is also included in the header file.

    Alternatively, if you are using the compiler in Visual Studio 2005 or later, you have a built-in way to check for these banned functions. To catch banned C runtime functions, you can compile with /W4 and then triage all C4996 warnings. In code reviews, you should always remove any code that disables the C4996 warning, e.g. #pragma warning(disable:4996). This is one simple way to ensure your code is released without banned functions.
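    As an illustration of what triaging one of these warnings typically leads to, here is a sketch of replacing an unbounded strcpy call with a bounded copy. `copy_name` is a hypothetical helper, and snprintf is used only as a portable C99 stand-in; with the Microsoft C runtime the usual replacement would be strcpy_s.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical replacement for a call such as strcpy(dest, src).
 * Copies src into dest, which holds dest_size bytes.
 * Returns 0 on success, or -1 if src does not fit; on failure dest is
 * left as an empty string rather than a silently truncated one.
 */
int copy_name(char *dest, size_t dest_size, const char *src)
{
    assert(dest != NULL && src != NULL);
    if (dest_size == 0) return -1;

    /* snprintf never writes more than dest_size bytes, NUL included,
     * and returns the length it would have needed. */
    int needed = snprintf(dest, dest_size, "%s", src);
    if (needed < 0 || (size_t)needed >= dest_size) {
        dest[0] = '\0';
        return -1;
    }
    return 0;
}
```

    Passing the destination size explicitly, usually as sizeof on the destination array, is what lets the copy fail cleanly instead of overrunning the buffer. As the MS08-067 discussion on this blog shows, the size argument still has to be computed correctly.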

    Sanitizing your code to remove potentially insecure APIs is a vital protection. Whether you include the banned.h header file or leverage the /W4-C4996 warnings in the Visual Studio 2005 compiler, you now have two ways to check your code and meet another SDL requirement in your development phase.

  • The Security Development Lifecycle

    Experiences Threat Modeling At Microsoft

    Adam Shostack here.  Last weekend, I was at a Security Modeling Workshop, where I presented a paper on “Experiences Threat Modeling at Microsoft,” which readers of this blog might enjoy.  So please, enjoy!

    And while I’m at it, I wanted to draw attention to some of the other presentations that I thought were very interesting, including one by Karine Peralta, “Specifying Security Aspects in UML Models,” and “Curriculum for Modelling Security: Experiences and Lessons Learned.”

  • The Security Development Lifecycle

    Mitigating Exploitation Techniques

    Hi, Matt Miller from Microsoft’s Security Science team here to talk about exploitation & mitigation.

    Over the past decade exploitation techniques have been developed and refined to the point that very little expertise is needed to successfully exploit software vulnerabilities.  These refinements have lowered the bar for attackers and drastically increased the probability that an attack will be successful.  This has led to the need for mitigation techniques that can prevent or otherwise reduce the reliability of a given exploitation technique.  In relation to one another, we can think about exploitation techniques as attempting to drive the probability of successful exploitation to 100%, whereas mitigation techniques attempt to drive the same probability to zero.  While probability gives us a nice measure for the effectiveness of a mitigation technique, it doesn't give us immediate insight into the specific problems being solved by mitigations or the techniques that are being used to solve those problems.

    Understanding the problems that are solved by mitigations is what provided the motivation for the presentation I will be giving at BlueHat.  Many of the materials in this presentation were taken from my work with Leviathan Security Group and have been repurposed to focus on taking attendees on a journey through the technical evolution of the mitigation techniques developed by Microsoft.  This evolution is illustrated in terms of the problems each mitigation technique is attempting to solve, the methods used to solve them, and how well each mitigation has stood the test of time thus far.  The journey starts with /GS and ends with a glimpse of the mitigation techniques we might expect to see in the future. 

    It is my hope that this presentation will illustrate that mitigations, when working in concert with one another, can be an effective method of helping to keep users secure by reducing the probability of a successful exploitation attempt for the majority of known exploitation techniques.