Lessons learned from the Animated Cursor Security Bug
Michael Howard here.
A core tenet of the SDL is to take and incorporate lessons learned when we issue a security update, and there is a great deal to learn from the recent animated cursor bug, MS07-017, so I want to spend a few minutes to go over some of the things we have learned from this bug.
First of all, this code is pretty old; is in Windows 2000, and predates the SDL. The SDL has parts (i.e., design review, threat modeling, testing, and security push) that focus on the product as a whole, and parts (i.e., code review and use of tools) that are focused on code. In the Windows Vista process, we banned certain APIs, like strcpy and strncpy, and changed well over 140,000 calls to use safer calls. memcpy wasn’t on that list. We also built in a lot of defense-in-depth measures because we know that the SDL can’t catch everything. Let’s start by looking at some of the defense-in-depth measures we have in place that didn’t stop the threat:
-GS
The bug is a stack-based buffer overrun in code that looks like this:
HICON LoadAniIcon() {
...
ANIHEADER myANIheader;
memcpy(&myANIheader, untrustedData->headerdata, untrustedData->headerlength);
Because there are no candidate buffers on the function’s stack, there is no -GS cookie added to the stack, even though the code is compiled with -GS. This is not the first time we’ve seen code with no cookie, and this has made us rethink the heuristics used by the compiler when it determines whether to place a cookie on the stack or not. But, changing the compiler is a long-term task. In the short-term, we have a new compiler pragma that forces the compiler to be much more aggressive, and we will start using this pragma on new code.
ASLR & SafeSEH
The next issue is that the code is wrapped in an exception handler that catches code failures. This can usually be good for reliability, but it has an interesting security side-effect: Windows Vista includes address space layout randomization (ASLR), and the goal of ASLR is to reduce the likelihood that an attacker can determine the address of critical functions. This makes it harder to make exploits run correctly. But, if the vulnerable code is wrapped in an exception handler that catches many errors, a failed attempt will not crash the component and the attacker can try again with a different set of addresses. David LeBlanc mentions the dangers of exception handling in a recent blog post.
Note that by itself the ”catch everything” construct is not a security bug, but it can aid an attacker if the exception handler wraps vulnerable code.
Now, let’s look at analysis tools.
Analysis tools
Interestingly, our static analysis tools do not flag this construct as a security bug because it’s a very low-priority warning, and is presently not an SDL “must-fix” warning. Compiling the following code with the C++ compiler in Visual Studio 2005 or the Windows Development Kit...
void Foo() {
__try {
}
__except( 1 ) {
}
}
...yields the warning:
1>c:\code\demo\demo.cpp(36) : warning C6320: Exception-filter expression is the constant EXCEPTION_EXECUTE_HANDLER. This might mask exceptions that were not intended to be handled
We’re investigating this issue further to determine ways of finding exception handlers that may wrap potentially vulnerable code.
The next question is: why did our static analysis tools not find this bug? Code that uses calls such as memcpy is hard to flag as vulnerable without generating a great many false positives. This is a research problem that no one has solved, here or elsewhere. Another angle is to replace calls to memcpy with memcpy_s which forces the developer to think about the destination buffer size. We may ban memcpy for new code, but we still need to analyze this further. Stay tuned.
SAL
Another interesting note is that this code is really hard to annotate with SAL because the function takes a PVOID buffer and the buffer length is buried inside a structure that is passed by reference! SAL can help find many bugs, however. Take the following code:
char Sample(div_t *i, size_t n) {
char c[32];
memcpy(c,(void*)i,n*sizeof div_t);
return c[0];
}
char y = Sample((div_t*)dwArray,21);
This would normally not be flagged as an error, but when you add SAL annotations:
char Sample(__in_ecount(n) div_t *i, size_t n)
The compiler generates the following warnings:
1>c:\code\demo\demo.cpp(55) : warning C6385: Invalid data: accessing 'argument 1', the readable size is '84' bytes, but '168' bytes might be read: Lines: 50, 55
1>c:\ code\demo\demo.cpp(55) : warning C6202: Buffer overrun for 'dwArray', which is possibly stack allocated, in call to 'Sample': length '168' exceeds buffer size '84'
Fuzz Testing
Next: why did our fuzz tests not find the bug? The animated cursor code was fuzz-tested extensively per the SDL requirements. It turns out none of the .ANI fuzz templates had a second “anih” record. This is now addressed, and we are continually enhancing our fuzzing tools to make sure they add manipulations that duplicate arbitrary object elements better.
A Silver Lining: IE Defenses
There is one silver lining: on Windows Vista, known malicious exploits that affected real customers were stopped dead because of UAC and Protected Mode IE; that is not to say exploits could not be written that are UAC/Protected Mode aware.
Our goals are to code well, catch bugs, and reduce the chance that new bugs enter the code. We have a set of features in place to catch things the process misses. We have a learning process and layers of defense in depth because we don’t expect perfect code -- ever.
Summary
In summary, SDL is not perfect, nor will it ever ever be perfect. We still have work to do, and this bug shows that. We have a new -GS pragma that adds more stack cookies; we’ve updated our fuzz tools; we will pay closer attention to exception handlers that could mask vulnerabilities, and we will investigate the impact of banning memcpy for new code. Finally, we will update our education as necessary with lessons learned from this bug.
-Michael