Office Crypto Follies

What I've been working on lately that has kept me from doing nearly anything else can be found at:

http://msdn.microsoft.com/en-us/library/cc313071.aspx

MS-OFFCRYPTO is very detailed documentation of exactly how we do cryptography for binary and OOXML documents. Overall, it covers:

  • IRM
  • Encryption and obfuscation
  • Password to modify
  • Digital signing

Before someone else figures out some of this and makes fun of us on Slashdot, let me be the first to detail what's really going on. Hang on – some of it is (in the immortal words of Warren Zevon) not that pretty at all. On a happier note, in the even more immortal words of Monty Python, "it got better" – what we're shipping now is quite good (for encrypting OOXML documents), and we have plans to make it even better.

Let's start with the worst of it – XOR. You may note that I consistently refused to ever say "XOR encryption", preferring the more accurate "XOR obfuscation". Not only is it the worst way to protect a document, but it was horrible to try and explain. We did all sorts of silly things to make this hard to figure out, and while it did nearly nothing to actually protect the data, it sure was no fun to try and document in a normative style. I believe this obfuscation dates back to around 1994. Here's some pseudo-code to show you the sheer horror of it all – this is from one of the two password verifier approaches:

FUNCTION CreatePasswordVerifier_Method1
PARAMETERS Password
RETURNS 16-bit unsigned integer
DECLARE Verifier AS 16-bit unsigned integer
DECLARE PasswordArray AS array of 8-bit unsigned integers

SET Verifier TO 0x0000
SET PasswordArray TO (empty array of bytes)
SET PasswordArray[0] TO Password.Length

APPEND Password TO PasswordArray
FOR EACH PasswordByte IN PasswordArray IN REVERSE ORDER
    IF (Verifier BITWISE AND 0x4000) is 0x0000
        SET Intermediate1 TO 0
    ELSE
        SET Intermediate1 TO 1
    ENDIF

    SET Intermediate2 TO Verifier MULTIPLIED BY 2
    SET most significant bit of Intermediate2 TO 0
    SET Intermediate3 TO Intermediate1 BITWISE OR Intermediate2
    SET Verifier TO Intermediate3 BITWISE XOR PasswordByte
ENDFOR
RETURN Verifier BITWISE XOR 0xCE4B
END FUNCTION
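
For anyone who wants to experiment with the collision behavior, here is a minimal C++ sketch of the pseudo-code above. It assumes the password has already been reduced to single-byte characters and is short enough for its length to fit in one byte; the function name is just a transliteration of the pseudo-code name, not an Office API.

```cpp
#include <cstdint>
#include <string>

// Sketch of CreatePasswordVerifier_Method1 from the pseudo-code above.
// Assumption: the password is a sequence of single-byte characters whose
// length fits in one byte.
std::uint16_t CreatePasswordVerifierMethod1(const std::string& password)
{
    // PasswordArray is the length byte followed by the password bytes.
    std::string passwordArray;
    passwordArray.push_back(static_cast<char>(password.size()));
    passwordArray += password;

    std::uint16_t verifier = 0x0000;

    // Walk the array from the last byte to the first.
    for (auto it = passwordArray.rbegin(); it != passwordArray.rend(); ++it)
    {
        // Rotate the low 15 bits left by one: bit 14 wraps around to bit 0.
        const std::uint16_t intermediate1 = (verifier & 0x4000) ? 1 : 0;
        const std::uint16_t intermediate2 =
            static_cast<std::uint16_t>((verifier << 1) & 0x7FFF);
        const std::uint16_t intermediate3 =
            static_cast<std::uint16_t>(intermediate1 | intermediate2);

        verifier = static_cast<std::uint16_t>(
            intermediate3 ^ static_cast<std::uint8_t>(*it));
    }

    return static_cast<std::uint16_t>(verifier ^ 0xCE4B);
}
```

Because each step is only a 15-bit rotate followed by an XOR with one byte, short passwords can only disturb the low bits of the result, which is why the password length leaks.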

We'd have been much better off just taking a CRC16 of the password. I wanted to see just how bad this was, and wrote up a quick app to try the first 2^40 alpha passwords, and I started seeing cycles in the collisions. Values would go from under-represented to over-represented very quickly. Further inspection shows that for reasonably short passwords, you can immediately tell the number of characters from this value. It seems that for a 1 character password, only the first 7 bits vary, for 2 characters 8 bits vary, and so on.

Oddly enough, this is so bad that it actually has a benefit. There are so many collisions that while you may find a password that will work, there's no assurance that you found the password used to obfuscate the file, so you're not as likely to be able to get into other things by brute-forcing the 16-bit verifier. Somewhere around 8 billion passwords only generate about 16,000 verifiers, so there are literally hundreds of thousands of possible passwords that could have created any given verifier.

What we have here is someone who is actually a very good general dev (he's now a well thought of dev manager) who tried to roll his own crypto and implement a simple hashing function. Moral of the story is DO NOT DO THIS.

If you look deeply into the obfuscation array initialization, you'll see another fairly ghastly mistake – the part of the array that coincides with the number of characters in the password actually varies, but the remainder of it is initialized based on values hard-coded into the binary along with values that end up in the document. This makes it possible to write a tool to directly extract quite a bit of information, and then there's the obvious disaster of what happens when you XOR chosen plain-text.

Our next attempt at encryption first showed up in Office 97, and featured RC4. As those of you who are at all familiar with encryption know, RC4 is really hard to do correctly, and this is an example of most of the mistakes you can make with RC4. The number one rule of stream ciphers is to NEVER re-use a key stream, since the crypt text is the result of the cipher stream XOR'd with the plain text. If you reuse a key stream, you can XOR the two crypt texts together to get the XOR of the two input plain texts, and that's often very easy to sort out. We did this more than once. Next rule of RC4 is that you have to have an integrity check, or you're subject to bit-flipping attacks – there's no integrity check. Finally, you should toss out the first 1k or so of the cipher stream, but we didn't know to do that.
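
The key stream re-use problem can be shown without any real cipher at all. The sketch below is only an illustration: the key stream is whatever fixed bytes the caller passes in, standing in for RC4 output, and the helper names are made up for this example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// XOR two byte strings, truncating to the shorter one.
Bytes XorBytes(const Bytes& a, const Bytes& b)
{
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    Bytes out(n);
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<std::uint8_t>(a[i] ^ b[i]);
    return out;
}

// Encrypt two plain texts with the SAME key stream, then XOR the two
// crypt texts. Returns true if the result equals plain1 XOR plain2,
// i.e. the key stream has cancelled out completely.
bool ReuseLeaksPlaintextXor(const Bytes& plain1, const Bytes& plain2,
                            const Bytes& keyStream)
{
    const Bytes crypt1 = XorBytes(plain1, keyStream);
    const Bytes crypt2 = XorBytes(plain2, keyStream);
    return XorBytes(crypt1, crypt2) == XorBytes(plain1, plain2);
}
```

The check holds for any key stream, which is the whole point: the attacker never needs to know the key to strip it off.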

Next, take into account that encryption was considered a munition at the time, and we were limited to 40-bit initial keys. These days, you can work your way through a 40-bit key space in minutes using only one computer. No need to bother with password cracking, just go directly to the key and attack it. The moral of this story is that agile encryption is a serious requirement, because while it may have taken some time to brute force 2^40 keys on a 286, a modern system will make short work of it – and in fact, it would be possible to store all the keys in a single file – it would only consume about 18 GB (much less if we did it as a tree of some kind). A second and more interesting flaw that our tester uncovered was that they thought they were doing an iterated hash (though only 16 iterations), but what they were doing in reality was concatenating the first 40 bits of the MD5 hash of the password, a 16 byte salt, and then repeating this concatenation 16 times. The old encryption library we used made this an understandable error, and it's harder to do with CryptoAPI or CNG. Moral of this story is that crypto is unlike any other code, and you should always get an expert to review what you're doing.

The RC4 encryption was then "strengthened" to use CryptoAPI, and could be configured to go to 128-bit keys, though unfortunately, the old 40-bit stuff was still default – this all happened in Office XP. Sadly, most of the implementation flaws remained. I found one place where there was triple key stream re-use (though only for 8 bytes) in the same spot. The unfortunate attempt at an iterated hash dropped back to a non-iterated hash of the password for reasons I don't understand. Some of the applications, notably PowerPoint, don't suffer as much from key stream re-use as others, and if you chose to use 128-bit RC4 and used a good password, a presentation could be relatively well protected.

As of Office 2007, we do warn you that the encryption we do on the binary documents is weak. Most of the time, it's so weak that it will only act as a mild deterrent. In some cases, we missed encrypting things entirely (which was actually called out in a KB article some time ago). My advice is that if you must encrypt a binary document, use a 3rd party tool to do it. These flaws are called out in the Security Considerations section of MS-OFFCRYPTO.

When we get to OOXML document encryption, the picture gets a lot better. I personally fixed some of the problems. We moved to AES, which is a block cipher, went to an iterated hash that far exceeds RFC2898 (50,000 cycles), and as I stated in a previous post, don't forget your password, or you may never see your data again. The only problems we had here were that we didn't support CBC (cipher block chaining) mode, and we didn't have an integrity check, though a 1 bit flip would result in 16 bytes of junk, so it isn't as high priority a problem as with RC4. We'll address both of these problems in the future, as well as some other improvements I'll talk about once we ship it.

If you take a good look at the ODF specification (as of 1.1) with regards to crypto, you see some of the same sorts of issues. They do some interesting things:

  • The number of times to iterate on the password hash is relatively low (1000), and is a fixed value.
  • The encryption algorithm must be Blowfish. Blowfish may well be a nice algorithm, but it hasn't been seriously validated, isn't FIPS compliant, isn't Suite-B compliant, and simply can't be used by some customers that require FIPS, Suite-B, or the analogues in their countries. Bruce Schneier himself has this to say about it – "At this point, though, I'm amazed it's still being used. If people ask, I recommend Twofish instead." It would be better to require an algorithm that has been validated, and better yet to allow it to be agile.
  • The integrity check and the password verifier are the same thing. There is no way to know whether the user has the wrong password, or whether a bit got flipped. This has the potential for data loss, though I'd suppose one could build a special tool to just try the decryption.

The really good news is that we have some people who are seriously good with cryptography on the Office TWC team now – our tester is really sharp, we've got a PM who previously worked in the crypto group in Windows and is on the Microsoft-wide crypto board, and we have some devs who know this stuff as well. I'm happy to say that when you encrypt an OOXML document, it will be very hard to brute force the password and retrieve the information – and it will keep getting better.

Lies, Damn Lies, Information Leaks, and Statistics

Robert Hensing posted some criticism of a study that purported to analyze how many users are at risk due to using out of date or unpatched browsers. Rob rightfully points out that you can actually be running a very old version of IE (depending on OS), and still be patched against current attacks.

A flaw that IE doesn't have is advertising to the server the exact minor version of the application. People often underestimate the value of information leaks – advertising the exact minor version is basically saying "Hello, you may attack me with these exploits, but I'm patched against those exploits." You can often figure this out with various fingerprinting techniques, but sometimes you can't. As it turns out, Safari, Firefox and Opera all have information disclosure flaws, and these were used to estimate the number of vulnerable browsers by examining Google's server logs. Because IE doesn't advertise this information to the server, they couldn't do a valid comparison, and dropped to a different data set.

Rob admits to not having much statistical training, but having spent far too much time in graduate school, I've had quite a bit. First thing to consider is the sample size. As it turns out, you don't need that many samples to be valid. Secunia's sample was around 500,000, which is more than adequate. The next thing to consider is whether we're really dealing with 2 different populations. To their credit, they do call this out:

Secunia [21] identified (for the month of May 2008) that 4.4% of IE7, 8.1% of Firefox, 14.3% of Safari (Windows only), and 15.2% of Opera users have not applied the most recent security patches available to them from the software vendor. In comparison, we discovered that 16.7% of Firefox, 34.7% of Safari (all OS), and 43.9% of Opera Web browser installations (using our Web server log-based measurements) had not applied the most recent security patches. We found that our Firefox, Safari, and Opera results were higher than those of Secunia's, differing by a factor of 2.1 (Firefox), 2.4 (Safari), and 2.9 (Opera), and attribute this difference to a probable bias for more security aware users to take advantage of Secunia's security scanner PSI than the average global community.

First of all, we can clearly establish that we're dealing with 2 distinctly different populations. The assertion that IE6 is insecure is invalid, as it is still in support and gets security patches just like IE7, and while IE 7 is doing a bit better on bulletin count, it isn't a huge difference, and as I've noticed with several other products, there is a grace period where attackers don't bother until there's enough adoption. While it is interesting that around twice as many Firefox users aren't fully patched as IE7 users, this might be an artifact of release timing. The authors of the study then attempt to deal with these different populations by comparing the Google results to the Secunia results, but there's a lot of variance between the browser types – if Firefox users going to Google are 2.1x less likely to be secure than Firefox users identified by the Secunia study, but Opera is different by a factor of 2.9x, then the difference between IE users overall vs. Secunia is really anyone's guess. Is it 2.9x? 4x? 1.5x? No one really knows.

It's an interesting thing to try and study, and the hypothesis that different patch delivery mechanisms might make a difference in how many users are at risk is also interesting. However, data on IE users, who are the majority of the population and could behave differently as a group than users of other browsers, is really not available, which makes the conclusions very questionable. Another factor that they appear not to have considered is that the number of browsers missing patches is going to be a function of how often you see patches. Something patched once a year is more likely to be fully patched at any given moment than something patched 25 times a year.

Interesting paper – too bad their conclusions aren't supportable for the bulk of the users who are using IE.

 

Yikes! Vista Security to be Obliterated!

Just picked up this link from Robert Hensing's blog - http://www.builderau.com.au/news/soa/Vista-security-to-be-obliterated-at-Black-Hat/0,339028227,339290040,00.htm. Seems Mark Dowd is going to be doing a presentation on how to bypass some of the defenses used by programs. I wrote the section on that for Writing Secure Code for Windows Vista, and I'm curious to see what he's come up with.

First of all, I'm going to paraphrase Jon Pincus again – there are 2 properties of a countermeasure that are important:

  • Any effective countermeasure will stop some exploits completely
  • For any given countermeasure, a sufficiently complex attack may be able to evade the countermeasure

This was the topic of a somewhat infamous exchange between myself and Gary McGraw some years ago. Gary found a way to evade an early version of /GS, and we hadn't claimed it was perfect, just better.

Let's review what I know about these defenses:

ASLR – there is a limited amount of randomness in where things show up in memory – only 8 bits. In some cases, you can get 16 bits because you might have 2 things moving independently (e.g., where the stack is and where a DLL loads). Problems you can run into include information leaks, or crashes that tell you things about where DLLs are in memory. Once a DLL is loaded, it stays in the same place. For DLLs used by a lot of apps, this could persist until you reboot. Thus an information leak in one app might help refine an attack against another. It is also true that a poorly implemented exception handler could make ASLR ineffective by allowing as many attacks as you need. This was seen in the .ani exploit. I've written about this problem previously here. ASLR also may not be applied to all DLLs, and that gives you a non-moving target. Additionally, if the exploit mechanics allow it, you might be able to code something that depends on an offset rather than finding a fixed address. For example, if you can jump to a predictable offset from where you are to something that calls the function you want to hit, then you have a way to find the function, even if you cannot predict where it is and go there directly.
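
To put a number on how little 8 bits buys once an attacker gets unlimited retries, here is a small arithmetic sketch. It assumes the layout does not change between attempts (as described above for an already-loaded DLL) and that every slot is equally likely; the function name is made up for illustration.

```cpp
#include <cmath>

// Chance that an attacker has hit the right location after trying
// `distinctGuesses` different slots, when the target sits in one of
// 2^bits equally likely positions and does not move between attempts.
double HitProbabilityAfter(unsigned bits, unsigned long long distinctGuesses)
{
    const double slots = std::ldexp(1.0, static_cast<int>(bits)); // 2^bits
    const double guesses = static_cast<double>(distinctGuesses);
    return guesses >= slots ? 1.0 : guesses / slots;
}
```

With 8 bits, a single blind attempt works 1 time in 256, but a handler that swallows the crash lets the attacker walk all 256 slots and succeed with certainty. Two independently moving targets (16 bits) push a single attempt down to 1 in 65,536.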

DEP – first problem is that a lot of hardware vendors don't turn it on, and "software DEP" isn't really DEP, but more like SafeSEH (sort of). If you're down to "software DEP", it might prevent an exploit that wasn't looking, but it won't help that much. Next problem is the return-into-libc issue, and a related nuance – if the exploit code you want to run can be constructed from code that's already in the app, you don't really need to execute shell code, just find something that's already there. You could also do data driven attacks, like how the original getadmin exploit against NT 4.0 just flipped a bit in the kernel. Second set of problems is finding a place to put your code. RWX pages do show up in apps for various reasons, and if you have enough control over your exploit to find that spot and put your shell code there, DEP won't save anyone. There's also the rather neat attack Mark did against byte code – if something essentially compiles input and runs it, you might be able to alter that, and DEP can't help with that problem, either.

SafeSEH – this is good to have, but it isn't one of the more robust mechanisms. Coupled with DEP, it isn't too bad. My preference here is to go to 64-bit code, where the exception handlers aren't writable.

I'm sure that Mark's clever enough to come up with something I haven't thought of – be interesting to see what he presents on that goes beyond what I have here.

Also, as I've said many times, all this stuff is just like seatbelts. They might help make the difference between a bad day and a REALLY bad day, but you don't want to have to use them. I would never punt a fuzz bug because of one of these. That said, I still recommend using seat belts. It's also true that with these countermeasures in place, there will be some set of previously exploitable conditions that can no longer be exploited at all because a sufficiently complex attack cannot be launched.

Don’t Feed or Tease the Bears…

I've learned over the years to avoid bragging about how much more secure something is than something else. We used to have lots of these debates back at ISS. It was inevitable – whoever was going on about how their OS was more secure than your OS had a root exploit show up for their OS that week. We finally came to the conclusion that plugging in the network cable was the worst thing you could ever do, and anything you did before or after that didn't have much effect on the outcome…

Personally, I like the approach of a good sports coach. The interview usually goes about like this:

"Coach, what do you think about the game that your unbeaten Crushers are going to play against the 0-12 PeeWees tomorrow?"

"Well, it could be a tough game. We'll just have to play our best and see how it goes."

I was reading Robert Hensing's blog today (found here), which referred to a really fluffy interview with Window Snyder, where she took quite the opposite approach:

In setting out to elevate Firefox's basic security, Snyder is also compelling Microsoft and Apple, maker of the Safari browser, to follow her lead — or get out of the way.

Snyder's rising star is sure to ascend even more this week, with the release of Version 3.0 of Firefox on Tuesday. The release is packed with new features, most notably stiffer security, faster speed and improved ease of use.

In keeping with my observation of Murphy's law back at ISS, I wasn't exceptionally surprised to see this post by Ryan Naraine:

Code execution vulnerability found in Firefox 3.0

Just hours after the official release of the latest refresh of Mozilla's flagship browser, an unnamed researcher has sold a critical code execution vulnerability that puts millions of Firefox3.0 users at risk of PC takeover attacks.

So how do I think our next release is going to do? Well, gee, there are a lot of determined people out there who are trying to find and sell exploits. We're going to work hard, do our best, and I hope it comes out well. I also think things like NX, ASLR, following the SDL, and doing our best to use least privilege might give us a leg up, but we'll see how it goes.

Sorry not to be posting lately – been really, really busy. I promise I'll have an interesting post about it when I'm done. This is also the time of year I tend to spend with my horse in the mountains…

More on Checking Allocations

Seems my last post met with some objections – somewhat rightfully so, as I mischaracterized one of Tom's points – he never advocated just not checking for allocations, but instead to use an allocator that has a non-returning error handler – though it seems some of his commenters were advocating that (I think they should go to rehab). This isn't always a bad way to go – I recommend using SafeInt the same way in some conditions. There are some problems that make allocators special, though. For example, say I write a DLL that allocates internally. If the DLL just calls exit() when it can't allocate, this could cause some really bad experiences for users. Same thing if the DLL just tosses some exception that the client isn't expecting. The right thing for such code to do is to either trap the error and propagate it back up the stack, or be exception-safe code and throw an exception that is caught just prior to returning to the client code, and return an error there.
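
A minimal sketch of that second option, catching everything at the DLL boundary and turning it into an error code, might look like the following. The status codes, `BuildTable`, and `LibBuildTable` are all hypothetical names invented for this example, not part of any real library.

```cpp
#include <cstddef>
#include <new>
#include <vector>

// Hypothetical status codes for an exported C-style entry point.
enum LibStatus { LIB_OK = 0, LIB_OUT_OF_MEMORY = 1, LIB_FAILED = 2 };

// Internal, exception-safe worker: free to throw (e.g. std::bad_alloc).
static void BuildTable(std::size_t count, std::vector<int>& table)
{
    table.assign(count, 0);
}

// Hypothetical exported function. No exception may escape into the
// client, so everything is caught just before returning.
extern "C" int LibBuildTable(std::size_t count, std::size_t* built)
{
    if (built == nullptr)
        return LIB_FAILED;

    try
    {
        std::vector<int> table;   // cleaned up automatically on unwind
        BuildTable(count, table);
        *built = table.size();
        return LIB_OK;
    }
    catch (const std::bad_alloc&)
    {
        return LIB_OUT_OF_MEMORY;
    }
    catch (...)
    {
        return LIB_FAILED;
    }
}
```

The client only ever sees a return value, so it does not matter whether it was built with exceptions enabled or is even written in C++.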

But wait – what about client code? I've said that if you get some weird input that results in bad math going into an allocator, then it _might_ be better to crash than to run arbitrary shell code, but the user experience still stinks. You've just improved things from disastrous to merely bad. Let's look at a concrete example – PowerPoint is all really nice exception-safe C++ code, and they're hard core about doing it right. Inside PowerPoint, allocators throw, and they handle errors where they make sense – typically by unwinding all the way back to where you can cleanly re-start the app, and not blow away other presentations you might be working on. PowerPoint, along with most of the rest of Office, calls into mso.dll, which is mostly very much not exception safe. If it started throwing exceptions into, say, Excel, this would not be a good thing. Thus, that code has to check every allocation, properly clean up, and return an error.

The real kicker to all of this is that we're entering a really interesting 64-bit world. My current motherboard handles 8GB, and if past trends hold, in the next 10 years, I could have over 100GB RAM in my system. The software we write today could well be in use for 10 years. Some whiz-bang 3-d video-intensive app could very easily think that allocating 6GB was a perfectly fine thing to do, and if it crashes on one system and runs on another, that's not a happy user experience. In the 64-bit world, we could have an allocation size that's absolutely correctly calculated, doesn't represent an attack, and it could quite easily fail on one system and not another. This isn't the sort of thing we want to crash over. I've got an implementation of vi written around 1993 that just exits if the file is bigger than about 1MB – seems really silly now. I do have a game installed at home that had a bug in it because some library got upset when it tried to use > 2GB for a video buffer. Stuff like that is going to be annoying when I build my next system in a few years and have 32GB or so.

Another issue is that crashes are really serious exploits to server code, even if the service restarts – the perf implications are horrible. I've seen instances where someone wanted to run what was considered client-side code on a server, and trying to get it to server level of quality was tough. You usually don't know when this might happen, so I'm kind of hard core about it – write solid code, and pick your poison – either don't use exceptions and check ALL your errors, or if you have the luxury of dealing with exception-safe C++ code, then do the work to handle exceptions in the right places, and use throwing new. Note that server code has been moving to 64-bit for some time.

Yet another reason to check these things, at least in existing code, is that we do often correctly handle out of memory conditions locally – "No, you cannot paste a 2GB image into this document, try something smaller" being one example. If I then add a check for int overflows in the same function, I don't want to in general introduce new error paths. What I can often do is make the int overflow show up as a bad alloc. For example:

#include <stddef.h>
#include <stdint.h>

template <typename T>
size_t AllocSize(size_t elements)
{
    if( elements > SIZE_MAX/sizeof(T) )
        return ~(size_t)0; // can't ever be allocated

    return elements * sizeof(T);
}

It then becomes really easy to guard against int overflows in that code base.
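
As a usage sketch, the size helper can be dropped in front of an existing malloc call so that an overflowing element count lands on the out-of-memory path the function already has. `AllocArray` is a hypothetical wrapper made up for this example, and the sketch assumes the allocator refuses a request of the maximum size_t value, which any real allocator has to.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Same idea as the helper above: an overflowing byte count is turned
// into a size that can never be allocated.
template <typename T>
std::size_t AllocSize(std::size_t elements)
{
    if (elements > SIZE_MAX / sizeof(T))
        return ~static_cast<std::size_t>(0); // can't ever be allocated

    return elements * sizeof(T);
}

// Hypothetical wrapper: returns nullptr both for genuine out-of-memory
// and for an element count whose byte size would have overflowed.
template <typename T>
T* AllocArray(std::size_t elements)
{
    return static_cast<T*>(std::malloc(AllocSize<T>(elements)));
}
```

The caller keeps its single existing check, `if (p == nullptr) return error;`, and no new error path is introduced.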

Something else that jogged my memory – I wrote here about a great presentation by Neel Mehta, John McDonald and Mark Dowd on finding exploits in C++ code. The thing is that there aren't many developer mistakes that don't lead to exploits one way or another. This is actually independent of language – there are things peculiar to C++ that can result in exploits, stuff that only C# can do, weird tricks with Perl, and on and on. If you want to be a better developer, reading the Effective C++ series is a big help. If you want to be a better code auditor/tester, learn to be a better developer, and you'll be better at spotting programming flaws.

Checking Allocations & Potential for Int Mayhem

Must be synchronicity. I started out the day with a really interesting mail from Chris Wysopal talking about how allocations can go wrong, fun with signed int math, and the new[] operator. Once I got done responding to Chris, I then noticed Robert Hensing's blog pointing me to Thomas Ptacek's comments about Mark Dowd's new exploit, found at http://www.matasano.com/log/1032/this-new-vulnerability-dowds-inhuman-flash-exploit/.

Some people were even saying that it makes no sense to check returns from malloc. I'm not sure what psychotropic substances were involved in that conclusion, but I don't want any. Let's look at all the ways this stuff can go wrong.

First, let's assume a non-throwing new (or new[]), or a malloc call. If you don't check the return and it fails, then we have roughly the following 3 categories, all of which are bad:

  1. No one adds anything user-controlled to the pointer, and it just gets used directly (or null + small, fixed offset). Your program crashes. If some other vulnerability exists, maybe you execute arbitrary code on the way out. I've written about this sort of thing before here and here. This is bad, customer is not happy, and if it is EVER used on a server, it is very bad.
  2. Someone adds a user-controlled offset to the pointer (and you didn't validate that, also bad). We now have an arbitrary memory write, and while Mr. Dowd is LOTS better at shell code than I want to be, making an exploit from one of these has the hard part mostly done.
  3. Next, and this is an interesting twist that's obvious once you think about it, is when someone then does this – ptr->array[offset], where offset is user controlled. Having ptr be null could actually even be an advantage, since you have an absolute starting point!
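
To make the third category concrete, here is a small sketch. The `Record` layout and both function names are invented for illustration. `TargetAddress` only does integer arithmetic to show where `ptr->array[index]` would land for a given base address; `ReadElement` shows the two checks that close the hole.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical structure holding an embedded array.
struct Record
{
    std::uint32_t header;
    std::uint8_t  array[64];
};

// Address that rec->array[index] would touch, computed as plain integers
// so nothing is actually dereferenced. With recAddress == 0 (a failed,
// unchecked allocation) the result depends only on the attacker's index.
std::uintptr_t TargetAddress(std::uintptr_t recAddress, std::size_t index)
{
    return recAddress + offsetof(Record, array) + index * sizeof(std::uint8_t);
}

// Checked accessor: rejects a null record and an out-of-range index.
bool ReadElement(const Record* rec, std::size_t index, std::uint8_t* out)
{
    if (rec == nullptr || out == nullptr)
        return false;
    if (index >= sizeof(rec->array) / sizeof(rec->array[0]))
        return false;

    *out = rec->array[index];
    return true;
}
```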

Now let's consider that we take the code and port it to 64-bit. On 32-bit, we can assume a 2GB alloc is going to fail. It has to. It's normal for an allocator to have code along these lines:

if( (ptrdiff_t)size < 0 ) fail_alloc;

Now go and change things to 64-bit, and a 2GB alloc might actually succeed, and there's actually no knowing what the upper bound might really be. You can now no longer say that 2GB ought to be enough for anybody, because it might not. Thus, in the non-throwing case, not checking returns is asking to get hacked, especially on 64-bit.

Next, let's consider the throwing case. This is something I was talking about with Richard van Eeden that's an example of bad code. He pointed this out:

Foo* pFoo;

try { pFoo = new Foo; }
catch(...) { delete pFoo; }

Obviously, there would be more to it than this, but if new – or just as importantly the Foo constructor – throws, then we delete pFoo, which is uninitialized, and we should ALWAYS consider uninitialized variables to be attacker controlled. The problem here is that the programmer did not recognize the try-catch as a branch, and didn't ensure that pFoo was initialized everywhere. Since this is moderately subtle, it's worth checking for in an audit.
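
One way to repair that snippet, sketched under the assumption that the caller wants a null pointer back on failure, is simply to initialize the pointer before the try block. The `Foo` here is a stand-in whose constructor can be told to throw, and `MakeFoo` is a made-up name.

```cpp
#include <new>

// Stand-in class: the constructor throws on request so both paths can
// be exercised.
struct Foo
{
    explicit Foo(bool fail)
    {
        if (fail)
            throw std::bad_alloc();
    }
};

// Returns nullptr if either the allocation or the constructor throws.
Foo* MakeFoo(bool fail)
{
    Foo* pFoo = nullptr;    // initialized on every path, including the catch

    try
    {
        pFoo = new Foo(fail);
    }
    catch (...)
    {
        // If the constructor threw, the memory was already released and
        // pFoo was never assigned, so this deletes a null pointer: a no-op.
        delete pFoo;
        pFoo = nullptr;
    }

    return pFoo;
}
```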

Now on to an interesting wrinkle – what about that throwing constructor? It's my preference not to do this if I can avoid it, but YMMV. So here's something interesting – now say someone wrote this:

class Foo
{
public:
    Foo()
    {
        p1 = new Thing1;
        p2 = new Thing2;
    }

    ~Foo()
    {
        // never runs if the constructor throws
        delete p1;
        delete p2;
    }

    Thing1* p1;
    Thing2* p2;
};

 

If the first allocation succeeds, and the second fails (or could be more complex constructors), then because the constructor never completed, the destructor never runs. This is thus dangerous code (especially if the first thing had global side-effects external to the app). There are two correct ways to fix it. The first is to properly use resource holders, like auto_ptr, and those will properly clean up even if the actual containing class destructor never runs. This is nicer, and you should _always_ use these when working with exceptions. The second way to make it work is to not do anything that can fail in the constructor, mark it throw(), and provide an init() method. Now the destructor has to run, things get cleaned up, even if init only makes it ½ way.
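
Here is a sketch of the resource-holder fix. It swaps in std::unique_ptr for auto_ptr, since auto_ptr has since been removed from the language; the cleanup-on-unwind behavior being illustrated is the same. `Thing1`, `Thing2`, the live-object counter, and `TryMakeFoo` are all stand-ins made up for this example.

```cpp
#include <memory>
#include <stdexcept>

static int g_liveThing1 = 0;    // counts Thing1 objects currently alive

struct Thing1
{
    Thing1()  { ++g_liveThing1; }
    ~Thing1() { --g_liveThing1; }
};

struct Thing2
{
    // Throws on request to simulate the second allocation failing.
    explicit Thing2(bool fail)
    {
        if (fail)
            throw std::runtime_error("Thing2 failed");
    }
};

class Foo
{
public:
    // Members are initialized in declaration order: p1, then p2.
    explicit Foo(bool failSecond)
        : p1(new Thing1), p2(new Thing2(failSecond))
    {
    }

    // No hand-written destructor: each holder cleans up its own object,
    // even when Foo's constructor never completes.

private:
    std::unique_ptr<Thing1> p1;
    std::unique_ptr<Thing2> p2;
};

// Returns true if a Foo could be constructed.
bool TryMakeFoo(bool failSecond)
{
    try
    {
        Foo foo(failSecond);
        return true;
    }
    catch (const std::runtime_error&)
    {
        return false;
    }
}
```

When the Thing2 constructor throws, the already-constructed p1 member is destroyed during unwinding, so the Thing1 is released even though ~Foo() never runs.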

This leads me to my second main point, which is that if you're using throwing allocators, then you MUST write exception-safe code. If you do not, then something is going to get in a bad state when the stack unwinds, and it's quite often exploitable.

I hope I've established pretty firmly that not checking allocation returns is a very bad thing, and that not using proper exception safe code when throwing exceptions (assuming that you're not using the exception as a proxy for CatchFireAndBurn() ) is also a prescription for disaster.

Lastly, there's an interesting side-effect of the fact that the VS 2005 (and later) new[] implementation detects int overflows: we can actually use it in some respects to check math. So let's say I pass an int into new[]. First problem is that if the int is negative, the allocation is going to fail as being too large, even if sizeof(element) == 1. If it succeeds, we know the value is >= 0. Next, count * sizeof(element) has to not overflow. For this to be true, count <= INT_MAX/sizeof(element) also has to be true, and we may then be able to leverage that to avoid having to do redundant checks on count further along in the function IFF we checked our returns.

More Checking for Pointer Math

Someone pointed out that it isn't sufficient to check for whether the pointer math wrapped, but that we also need to check that the resulting pointer is in our buffer. They then came to the possibly erroneous conclusion that really all you had to do was to check whether the resulting index was in range. The real problem with this is that there are so many different scenarios that I don't see a one size fits all technique.

Depending on the code involved, it may not be sufficient to just know you're in the buffer. For example, bitmaps have 2 different ways to determine where to start drawing (scan0). Top left corner is easy – it's 0,0. Bottom left corner is harder – it's pixel 0 of the last line. So if I calculated scan0, and it ended up in some random place in the buffer, and then I started drawing up from there, I'd eventually start drawing my bitmap header and pieces of the heap and stack. This might lead to ways to overcome ASLR or other mischief.

Where the comment is absolutely correct is that if we're just doing a simple de-reference (get the nth element), then all we have to do is determine if n is somewhere in the array, and there's no need to do any complicated pointer math checking. If 0 <= n < max_elements, then we're in great shape. One would hope that this isn't the sort of code where we'd be checking if the pointer wrapped anyway. In other cases, I might want to do something more complicated, like calculate where to access some element given a starting point of a variable length header, and the element is in some structure following that header. To use a contrived example, figure out the pointer to the SID contained in the 5th ACE of a SECURITY_DESCRIPTOR's DACL – ick. This actually brings up a reasonable example – say someone stored a self-relative security descriptor in a file (Excel does this), and you've found out that passing a "malformed" security descriptor to random Windows APIs resulted in a non-exploitable crash. Now go write code to completely validate a security descriptor and all of the associated sub-structures. You have 4 possible elements following the header, and you have to check not only that each of them starts in the buffer, but also that none of them go outside the buffer, and worse yet, that none of them overlap. This is quite tricky, since a DACL is a variable sized struct that contains a variable number of ACEs, which are also variable in size, and the ACEs contain SIDs, which is why they vary in size, and so on. Got to be quite a bit of code. The Windows APIs along the lines of IsValidSecurityDescriptor() are of very limited help in this area.
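
The three checks named above (starts in the buffer, ends in the buffer, no overlap) can be sketched generically. This is not a security descriptor validator: `Span` is a hypothetical offset/length pair standing in for whatever sub-structures the format has, and a real validator would still have to parse each element to learn its length.

```cpp
#include <cstddef>

// Hypothetical description of one variable-sized element in a buffer.
struct Span
{
    std::size_t offset;
    std::size_t length;
};

// True if [offset, offset + length) lies inside a buffer of bufferSize
// bytes. Written with a subtraction so nothing can wrap.
bool InBuffer(const Span& s, std::size_t bufferSize)
{
    return s.offset <= bufferSize && s.length <= bufferSize - s.offset;
}

// True if two spans share at least one byte. Both spans must already
// have passed InBuffer, which guarantees the sums below cannot overflow.
bool Overlaps(const Span& a, const Span& b)
{
    if (a.length == 0 || b.length == 0)
        return false;
    return a.offset < b.offset + b.length && b.offset < a.offset + a.length;
}

// Every span must be inside the buffer and no two spans may overlap.
bool ValidateLayout(const Span* spans, std::size_t count, std::size_t bufferSize)
{
    for (std::size_t i = 0; i < count; ++i)
        if (!InBuffer(spans[i], bufferSize))
            return false;

    for (std::size_t i = 0; i < count; ++i)
        for (std::size_t j = i + 1; j < count; ++j)
            if (Overlaps(spans[i], spans[j]))
                return false;

    return true;
}
```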

While there are some simple cases that can be done easily, it is a hard requirement that pointer math must be mathematically correct AND the result makes sense in terms of the buffer you're dealing with. And as we now know, we also have to worry about doing it in such a way that we don't run into undefined behavior according to the language.

Evil Compiler Tricks, and Checking for Pointer Math

My favorite programming geek hobby being integer overflows, this caught my eye –

"gcc silently discards some wraparound checks" http://www.kb.cert.org/vuls/id/162289

Basically, what it says is that code which looks like this:

============ snip ==============

        char *buf;
        int len;

gcc will assume that
buf+len >= buf.

As a result, code that performs length checks similar to the following:

len = 1<<30;
[...]
if(buf+len < buf)  /* length check */
  [...perform some manipulation on len...]


are compiled away by these versions of gcc

============ /snip ==============

Apparently, the compiler may be allowed by the C/C++ standard to do this, as pointer wraparound is undefined behavior. However, the standard also says that optimizations may not change externally observable behavior in an application (aside from making it go faster), so I'm not so sure this is completely legal, but I Am Not A Standards Geek (IANASG), so I'll let them fight it out. The CERT advisory says to do this:

if((uintptr_t)buf+len < (uintptr_t)buf)

Being alarmed by this problem, I then consulted what I do in SafeInt, which is:

template < typename T, typename U, typename E >
T*& operator +=( T*& lhs, SafeInt< U, E > rhs )
{
    // Cast the pointer to a number so we can do arithmetic
    SafeInt< uintptr_t, E > ptr_val = reinterpret_cast< uintptr_t >( lhs );

    // Check first that rhs is valid for the type of ptrdiff_t
    // and that multiplying by sizeof( T ) doesn't overflow a ptrdiff_t
    // Next, we need to add 2 SafeInts of different types, so unbox the ptr_diff
    // Finally, cast the number back to a pointer of the correct type
    lhs = reinterpret_cast< T* >( (uintptr_t)( ptr_val + (ptrdiff_t)( SafeInt< ptrdiff_t, E >( rhs ) * sizeof( T ) ) ) );

    return lhs;
}

So let me explain what exactly we're doing here, and why the CERT advice is bad (except where sizeof(T) == 1). First thing is to cast the pointer to an unsigned type that can always hold the bits in a pointer, and store that in a SafeInt. This happily coincides with the fact that unsigned int overflow behavior is actually specified. The second thing to do is to ensure that the offset multiplied by sizeof(T) does not overflow, and store the result in a ptrdiff_t (we could be moving backwards in the array), and finally check whether the addition ends up overflowing.
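For readers without SafeInt at hand, the same two-step check can be sketched with plain integers. This is an illustration, not SafeInt's implementation: `CheckedPointerOffset` is a hypothetical name, it reports failure with a bool instead of SafeInt's error policy, and it assumes `uintptr_t` is at least as wide as `ptrdiff_t`, which holds on the common platforms I know of.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>

// Hypothetical helper: computes base + count * elemSize entirely in unsigned
// integer math. Returns false if the multiplication overflows, if the byte
// offset does not fit in a ptrdiff_t, or if the addition wraps around.
bool CheckedPointerOffset(std::uintptr_t base, std::ptrdiff_t count,
                          std::size_t elemSize, std::uintptr_t& result)
{
    assert(elemSize != 0);  // elemSize stands in for sizeof(T), never zero

    const std::uintptr_t maxPtr = std::numeric_limits<std::uintptr_t>::max();
    const bool backwards = count < 0;

    // Magnitude of count; the +1 dance avoids negating the most negative value.
    const std::uintptr_t magnitude = backwards
        ? static_cast<std::uintptr_t>(-(count + 1)) + 1u
        : static_cast<std::uintptr_t>(count);

    // Step 1: the multiplication by the element size must not overflow...
    if (magnitude > maxPtr / elemSize)
        return false;
    const std::uintptr_t bytes = magnitude * elemSize;

    // ...and the byte offset must still be representable as a ptrdiff_t.
    if (bytes > static_cast<std::uintptr_t>(std::numeric_limits<std::ptrdiff_t>::max()))
        return false;

    // Step 2: the addition (or subtraction) must not wrap.
    if (backwards) {
        if (bytes > base)
            return false;
        result = base - bytes;
    } else {
        if (bytes > maxPtr - base)
            return false;
        result = base + bytes;
    }
    return true;
}
```

Passing this check only says the arithmetic is sound. Whether the resulting address is inside the buffer you care about is a separate test.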

If you write code that only checks to see if the result of a pointer addition is greater than what you started with, your compiler might just remove the check, but what is much worse is that your code might not be working even if the compiler does not remove it. To write really correct code, you have to check BOTH the multiplication and the addition operations. As an aside, a problem I like to give to people as a puzzle is to write the correct code to determine if a + b * c yields a valid result or not. Surprisingly few people get it, but I think we can see here just how often we need to get exactly this operation right. Here's a trick that works for 32-bit unsigned code:

unsigned __int64 result = (unsigned __int64)a + (unsigned __int64)b * (unsigned __int64)c;

if( (unsigned __int32)(result >> 32) )
    return error;
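The same trick can be written with the fixed-width types from `<cstdint>` so that it compiles outside Visual C++. This is a sketch: the function name `MulAdd32` and the bool-plus-out-parameter shape are my own choices for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes a + b * c for 32-bit unsigned inputs.
// Returns false if the true result does not fit in 32 bits.
bool MulAdd32(std::uint32_t a, std::uint32_t b, std::uint32_t c, std::uint32_t& out)
{
    // Widen BEFORE multiplying. The largest possible value,
    // (2^32 - 1) + (2^32 - 1)^2 = 2^64 - 2^32, still fits in 64 bits.
    const std::uint64_t wide =
        static_cast<std::uint64_t>(a) +
        static_cast<std::uint64_t>(b) * static_cast<std::uint64_t>(c);
    assert(wide >= a);  // the 64-bit sum itself cannot wrap for 32-bit inputs

    // Any bit set in the upper half means the 32-bit result would have overflowed.
    if ((wide >> 32) != 0)
        return false;

    out = static_cast<std::uint32_t>(wide);
    return true;
}
```

Note that this only works because a wider type exists. For 64-bit inputs there is no such shortcut, and the multiplication and the addition each need their own check.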

Visual C++ Defenses and 64-bit

Michael Howard just published a good article here on how Visual C++ features can help protect your app. I go into a fair bit more detail on these in our most recent book, "Writing Secure Code for Windows Vista" (WSCV) if you're curious. Something Michael left out of the article was how this changes when you go to 64-bit – to be completely accurate, I'd need to go sit down with my compiler and debugger and go look at the actual code that's emitted – which I don't have time for this evening – but I'll take a shot anyway, and if I get something wrong, I'm sure someone will correct me.

  • Stack Overrun detection – by and large the same stuff as on x86, but with one interesting twist. There's a lot more registers available in x64, which is part of why some code runs faster on x64, and one of the things that changes is that x64 uses fastcall as the default calling convention. What this does on x86 is that it puts the first two arguments into registers (ECX and EDX) rather than pushing them on the stack. If it's in a register, it can't be overrun as easily. On x64, it puts the first four arguments in registers. So this largely removes a fairly significant attack vector, and means that most functions won't need duplicate arguments in 2 places on the stack. It's also arguably a little safer, since a buffer _underrun_ might still get the x86 code, but won't typically attack the argument on x64.
  • SafeSEH – IIRC, the exception handlers get compiled in, and aren't loitering on the stack to see if they'll get mugged.
  • NXCompat – everyone gets this, like it or not. Much safer that way.
  • DynamicBase (ASLR) – I don't think there's any changes here.

The CRT is still the CRT, so no difference there, and new is the same thing as before. Something else that we do that I learned from John McDonald is that when you call delete[], the destructor needs to get called for every object in the array. Some other compilers will run through the whole array, calling the destructor for each object in turn, referencing what that object claims is its destructor. Thus if someone can whack a vtable for just one of them, off to the arbitrary code races we go. Visual C++ will just call the destructor function for the first one, and it's less likely that an attacker can get to that one vtable – this ends up being safer.

Use of ASLR, NX, etc

Found a really great post by David Maynor here. He points out that various counter-measures aren't always used by apps other than Windows. I would have commented directly to his blog, but didn't feel like signing up, so I'll make some comments here –

David Maynor said:

A good example of this is Apple. Their QuickTime application (part of iTunes) is installed on a lot of Windows computers, if not most. Yet, there is a new vulnerability discovered in QuickTime every couple months because the code is inherently insecure. The vulnerabilities allow attackers to break into Vista machines because Apple doesn't take advantage of Vista's security features.

It doesn't have to be that way. One of the most important Vista security features is "address space randomization" or "ASLR". […]

I wouldn't tend to phrase it quite that strongly myself, but what I've been told is the case is that Apple requires compiling with gcc. This is something that makes things more difficult for me, since gcc's support for templates isn't the best (about the same as Visual C++ 6), and SafeInt - despite being purely standards-compliant C++ and not platform-specific – won't compile. So what I suspect might be going on with QuickTime is that it gets compiled with gcc for all platforms, and I'm guessing that gcc won't set the ASLR bit. If my guess is right, it would be a Good Thing for someone to go add that.

[update] I've been informed that I'm wrong and that QuickTime is using the Visual Studio compiler, and should be flipping the ASLR bit in upcoming releases. My apologies.

Something else David said:

Turning on the ASLR flag in all products will undoubtedly cause (or expose) a few bugs, but most software will run just fine. There is no reason for software companies to continue to ignore this issue. Among the companies/products currently ignoring these features are: Mozilla's Firefox, Google's toolbar, Apple's iTunes, Adobe's PDF reader, Roxio's media creation tools, and Divx's player. Actually, we haven't found any company that turns on ASLR consistently.

We flipped the ASLR bit in Office 2007 very late in the development process, and we did this with great trepidation. Office has a lot of really complex code, and generally even a small change is going to have some side-effect. What we found broke was testing harnesses that had hard-coded app addresses. There was not a single break in Office itself. Anyone can flip the ASLR bit with very little fear of regressions. From our experience with a huge, varied code-base, I'd expect ASLR to cause bugs in extremely rare cases – unless it's an exploit.

Note that the same is not true of NX – this can take some work, depending on what your app does. If you have an ordinary app, it might not cause any problems. Something else to consider as you port to 64-bit is that you get NX enabled whether you like it or not, so you may as well figure out what needs to get fixed now. This is one reason I prefer running a 64-bit OS – some of the exploits aren't going to work until people start building x64 exploits.
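If you want to see whether a given binary has these bits set, they live in the DllCharacteristics field of the PE optional header (0x0040 for ASLR/DYNAMICBASE, 0x0100 for NX compatibility). Here is a minimal sketch of reading them from a file image already loaded into memory. `InspectPe` and `MitigationFlags` are hypothetical names, and the parser checks only what it needs to reach that one field; it is not a general PE validator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct MitigationFlags { bool valid; bool aslr; bool nx; };

// Bounds-checked little-endian reads; the subtraction form avoids off + n wrapping.
static bool ReadU16(const std::vector<unsigned char>& d, std::size_t off, std::uint16_t& v)
{
    if (d.size() < 2 || off > d.size() - 2) return false;
    v = static_cast<std::uint16_t>(d[off] | (d[off + 1] << 8));
    return true;
}

static bool ReadU32(const std::vector<unsigned char>& d, std::size_t off, std::uint32_t& v)
{
    if (d.size() < 4 || off > d.size() - 4) return false;
    v = static_cast<std::uint32_t>(d[off]) |
        (static_cast<std::uint32_t>(d[off + 1]) << 8) |
        (static_cast<std::uint32_t>(d[off + 2]) << 16) |
        (static_cast<std::uint32_t>(d[off + 3]) << 24);
    return true;
}

MitigationFlags InspectPe(const std::vector<unsigned char>& image)
{
    MitigationFlags r{false, false, false};
    std::uint16_t mz = 0, optSize = 0, magic = 0, dllChars = 0;
    std::uint32_t peOff = 0, sig = 0;

    if (!ReadU16(image, 0, mz) || mz != 0x5A4D) return r;              // "MZ"
    if (!ReadU32(image, 0x3C, peOff)) return r;                        // e_lfanew
    if (!ReadU32(image, peOff, sig) || sig != 0x00004550u) return r;   // "PE\0\0"

    // Optional header follows the 4-byte signature and 20-byte COFF header.
    const std::size_t opt = static_cast<std::size_t>(peOff) + 24;
    assert(opt > peOff);  // peOff was just validated against the image size

    // SizeOfOptionalHeader must cover the field we want (offset 70, 2 bytes).
    if (!ReadU16(image, static_cast<std::size_t>(peOff) + 20, optSize) || optSize < 72) return r;
    if (!ReadU16(image, opt, magic) || (magic != 0x10B && magic != 0x20B)) return r;
    if (!ReadU16(image, opt + 70, dllChars)) return r;  // same offset in PE32 and PE32+

    r.valid = true;
    r.aslr = (dllChars & 0x0040) != 0;  // IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE
    r.nx   = (dllChars & 0x0100) != 0;  // IMAGE_DLLCHARACTERISTICS_NX_COMPAT
    return r;
}
```

Every read goes through a bounds check on purpose, since a tool that scans arbitrary binaries is itself parsing untrusted input.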

He also said:

In the screenshot, we scanned the Microsoft Office subdirectory on a Vista machine. We find that while a lot of code supports ASLR, a lot of code doesn't. We likewise see that only some code supports the NX bit. We also see that Microsoft Office still uses dangerous functions like "strcpy". As it turns out, most of Microsoft Office is secure, but that code supporting legacy features (such as older file formats) is where it's insecure.

Anything we built in Office 2007 should have ASLR – if it doesn't have ASLR, then we likely purchased it from a 3rd party. In the future, we'll make sure ASLR, etc. are enabled for everything. I'd also point out that just because you see strcpy present doesn't always mean there's an exploit. Copying a fixed string into a fixed buffer is safe. We actually swept the code to remove these back in Office 2003. It's been noted that Vista has removed these and used SAL – what's less known is that we did this in Office 2003 and the low regression rate and positive results we got from it ended up in the SDL and Vista. Obviously, if David's finding these, some must have either been missed, or we decided that instance was safe. I'd like to reinforce one of his points – some of the code that supports older formats, especially formats we didn't produce, is scary stuff to try and go change, and we don't think it can easily be made as secure as either more modern code, or even older code that we wrote for our own formats. This is one of the reasons we've disabled some of these in Office 2007 and Office 2003 SP3. It's interesting to see a noted researcher come to the same conclusion.

I'm also going to make a counter-argument I think David might agree with – ASLR, NX and other techniques work as great safety devices that make a flaw less likely to be exploitable, leaving it an ordinary bug instead. You should _never_ write code that assumes flaws are OK. Solid code tends to be secure code.

I'll try to post more later – been busy lately.

DLL Preloading Attacks

A DLL preloading attack is something that can get you on a lot of different platforms. One of the first variants I heard about was in an ancient telnet daemon on certain versions of UNIX where you could specify environment variables, and one of the things you could specify was where to look for libraries. Obviously, if you could get the telnet daemon running as root to load your library, it was then your system.

A difference between UNIX-ish systems and systems based on DOS is that the current directory "." is not on the search path for UNIX-ish systems, and it is for DOS systems, which didn't have different users, so there was no need to worry about some of these things. Originally, a Windows system would look for DLLs using the same ordering that you'd look for an executable – as documented in the SearchPath API:

The directory from which the application loaded.
The current directory.
The system directory. Use the GetSystemDirectory function to get the path of this directory.
The 16-bit system directory. There is no function that retrieves the path of this directory, but it is searched.
The Windows directory. Use the GetWindowsDirectory function to get the path of this directory.
The directories that are listed in the PATH environment variable.

The attack is that you find some DLL an app needs, make an evil twin, and put it in the same directory as a document, then lure someone who you'd like to have running your code to open the document. This is obviously a problem, and the advice we gave in Writing Secure Code (1&2) was to fully path the library you wanted to access with LoadLibrary. This advice isn't always the best, since if you weren't sure where you were installed, you might use SearchPath to go find it, which looks in the current directory, and now you have a problem again.

What we did to fix it correctly was to make a setting that moved the current directory into the search order immediately before the path is searched, and after everything else. This took effect by default in XP SP2, Win2k3 and later, and was available in Win2k SP4. For the most part, this did get rid of the problem – if it was a DLL in the operating system, that got searched well before the current directory and all was good.

Unfortunately, this isn't a complete fix in all cases – there are some times that we'd like to test to see if a DLL is present, and then do something special if it is. Even with the current directory moved to the end of the search order, if it isn't there, we'll still look in the current directory. So code that looks like this:

hMod = LoadLibrary("Foo.dll"); // check to see if Foo is present

Will be dangerous. You have a couple of good options in dealing with this. If you never have a need to load a DLL from the current directory, just call SetDllDirectory with an argument of "". This is something I discovered by playing with the API, went and looked at the code and found that it was an intended use of the function, logged a bug, and now it's documented behavior you can depend on. If you can do this, it's best – you don't have to put a lot of overhead around every LoadLibrary call, and you're safe. The API is available in XP SP1 and later, which is pretty safe as a minimum platform these days. A second approach that would involve a bit of work on your part would be to implement only the bits of SearchPath that you need. Here's what I'd do:

  1. Search your app's directory – you can find this with GetModuleFileName using NULL as the first parameter.
  2. Look in the system directory as above
  3. Look in the Windows directory as above

There could be some wrinkles around side-by-side DLL's, and I haven't looked closely at this aspect of the problem – perhaps someone who has could comment. An option that I'd tend to discourage would be to load the library with LOAD_LIBRARY_AS_DATAFILE as a flag, then using GetModuleFileName to see if it is the one you wanted, or checking somehow to see if it is the one you wanted using some form of checksum. The first problem is that this is a lot of overhead, and the second is that if there's a path to parse, odds are you'll do something wrong and break when the format of the return changes, or you'll foul up and make the wrong decision, since making decisions based on names is hard. Checksums are easily defeated, and real cryptography is computationally expensive.
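The restricted search described in the three steps above could be sketched roughly like this. The names `BuildCandidates` and `LoadFromTrustedDirs` are hypothetical, the Windows-only part is guarded so the path logic can be compiled and checked anywhere, and the sketch ignores the side-by-side wrinkle entirely.

```cpp
#include <cassert>
#include <initializer_list>
#include <string>
#include <vector>

// Builds full candidate paths in the order: app directory, system directory,
// Windows directory. Refuses anything that is not a bare file name.
std::vector<std::wstring> BuildCandidates(const std::wstring& appDir,
                                          const std::wstring& sysDir,
                                          const std::wstring& winDir,
                                          const std::wstring& dllName)
{
    std::vector<std::wstring> out;
    if (dllName.empty() || dllName.find_first_of(L"\\/:") != std::wstring::npos)
        return out;

    for (const std::wstring* dir : { &appDir, &sysDir, &winDir }) {
        if (dir->empty())
            continue;  // skip directories we could not determine
        std::wstring full = *dir;
        if (full.back() != L'\\')
            full += L'\\';
        out.push_back(full + dllName);
    }
    assert(out.size() <= 3);  // at most one candidate per trusted directory
    return out;
}

#ifdef _WIN32
#include <windows.h>

// Tries each fully qualified candidate in turn; never consults the current directory.
HMODULE LoadFromTrustedDirs(const std::wstring& dllName)
{
    wchar_t buf[MAX_PATH];
    std::wstring appDir, sysDir, winDir;

    DWORD n = GetModuleFileNameW(nullptr, buf, MAX_PATH);
    if (n != 0 && n < MAX_PATH) {
        appDir.assign(buf, n);
        const std::wstring::size_type slash = appDir.find_last_of(L'\\');
        if (slash != std::wstring::npos) appDir.erase(slash); else appDir.clear();
    }
    UINT m = GetSystemDirectoryW(buf, MAX_PATH);
    if (m != 0 && m < MAX_PATH) sysDir.assign(buf, m);
    m = GetWindowsDirectoryW(buf, MAX_PATH);
    if (m != 0 && m < MAX_PATH) winDir.assign(buf, m);

    for (const std::wstring& path : BuildCandidates(appDir, sysDir, winDir, dllName)) {
        if (HMODULE h = LoadLibraryW(path.c_str()))
            return h;
    }
    return nullptr;
}
#endif
```

If you never need the current directory at all, calling SetDllDirectory with "" once at startup, as described above, is still the simpler option. Note also that full paths only pin down the DLL you name; anything it depends on is still found through the normal search.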

I've been meaning to write this for a while, as it's one of the portions of Writing Secure Code that I'd like to update -

Terminating your app on heap corruption

Michael Howard has a FAQ on this here – there's also more information on this and related defenses in one of my chapters in Writing Secure Code for Windows Vista. One of the things I'd like to point out about enabling this, and several other defenses, like NX, SafeSEH and some others, is that you get them by default in 64-bit code. I'd suspect that for many of you, you'll get more testing done on 32-bit in the near term, and by turning these on, you're narrowing the differences between the two platforms. While I seriously doubt that setting HeapEnableTerminationOnCorruption is going to cause any regressions, you want to maximize your chances of finding these.

There's one question not answered in Michael's FAQ that this inquiring mind would like to know – just exactly what triggers this? I suppose I should go look at the source and see where it branches, but in the sample apps I wrote for WSCV, I couldn't find any that wouldn't abort without this flag set. If someone has some info on this that I can post, please send it.

It was interesting – the heap management in Vista is much, much more robust than previous versions. For example, you can malloc 3 buffers right in a row, then use the first pointer to just trash all 3 buffers. On XP and Win2k3, the app just trundles right along until you try to free the trashed buffers, and then it dies in a possibly exploitable manner. On Vista, as soon as you try to free the first buffer, the app is in the exception handler. Even though I'm not sure how much extra HeapEnableTerminationOnCorruption does for you, it's still a good thing to have set.

Speaking of which, here's a handy way to do this on cross-platform code without making unneeded API calls:

#if !defined (_WIN64) && _WIN32_WINNT >= 0x0600

void SetHeapTerminate(HANDLE HeapHandle)
{
    (void)HeapSetInformation( HeapHandle, HeapEnableTerminationOnCorruption, NULL, 0 );
}

#else // 64-bit or prior to Vista

void SetHeapTerminate(HANDLE HeapHandle)
{
    // Avoid W4 warnings -
    HeapHandle;
}

#endif

HD vs. Blu-ray (2)

I promise I'll get back to security stuff shortly, but over the weekend I ran into a couple of articles that explain the issues a lot better. So HD-DVD is quite likely going the way of the 8-track – no need to fight the tide (and no, I have no internal info on this at all – I work on Office, not Xbox). Now the real question is what to do next, and when. My options in terms of getting 1080p content in the future are limited to Blu-ray, and downloads. I don't have enough bandwidth at home to make downloading multiple gigs worth of movies practical, so I'm stuck with the 1080i content I can record and watch coming from the dish – which isn't bad at all.

As far as Blu-ray goes, we all know that timing is everything when it comes to techno toys. It really stinks to buy something and then have something much better come out next week at the same price. Or you can wait forever and not enjoy the technology. But Blu-ray has some interesting issues just at the moment – seems that it was rushed out without a fully developed spec (seems this happens a lot, and we're not without sin in this area), and that it's shifting rapidly at the moment. An article at Audioholics sums it up – basically, there's the 1.0 spec, the 1.1 spec, and the 2.0 spec, each with associated marketing buzzwords to confuse us. Basically, one of the reasons I'm most annoyed that Blu-ray won out is that I already own an HD-DVD player that's equivalent to Blu-ray 2.0 spec, which I can't buy yet >8-(

Video Business has an article here that explains the issue in terms of which available players support which spec – for example, the Samsung BD-P1400 can be had for a reasonable cost right now, but it only supports the 1.0 spec, can't be upgraded, and it isn't guaranteed that you can play all future movies with it – there have been issues with this already. Buying a 1.0 spec player seems like a Bad Thing™. Personally, waiting for a 2.0 spec player seems like the right decision – hopefully, the spec will have settled down and there won't be a 2.1 or 3.0 anytime very soon. However, it seems like you can't buy any of these at the moment. Options there are to get a PS3 – but I already own an Xbox, so that's expensive when I already have more games than time, or there's a couple of players coming out RSN, like the Panasonic DMP-BD50 – looks nice, but you can't buy one now, and everyone seems to think it will be expensive (~$700). There are a couple of 1.1 spec players that are upgradable, but these are currently $629 at Amazon. Still too much. I can almost build a media center PC with a Blu-ray reader for that price, which would do a lot more and can be upgraded when I feel like it.

Looks like the 2.0 spec will start getting common around the summer, but I expect them to remain expensive for a while. I think I'll add an external hard drive to my DVR, and watch movies off the dish until this settles out and prices drop – and the Xbox does a _great_ job of upconverting. My prediction – they'll find that while the format war didn't help adoption, the biggest blocker is going to be price. I'm also looking forward to when there's enough bandwidth commonly available that I can just download these things. Why worry about scratched discs when I can just stick stuff on a media center box, and pipe it anywhere in my house I like?

HD vs. Blu-Ray

OK, so this isn't security related at all, just felt like grumbling about the latest development. If you're not interested in my thoughts on this, skip it now.

A few years ago, I remodeled my basement, and took an odd room with only one window and wired it for home theater. It gathered dust until this last November, when I finally decided that HD TV had come down enough to be reasonable. The whole thing was quite an adventure of figuring out what to buy, then where to buy it. For example, Tripp-Lite makes some really good power conditioning stuff that is as good as the Monster product, and ½ the price. Also managed to buy the monitor for 30% off list, which was a good trick. It became like a 2nd job, and I've thought more than once that hiring someone to do this might have emptied my wallet much more quickly, but may have been worth it to avoid the headaches.

So this then put me in the quandary of how to feed it 1080p inputs – just a few months ago, HD-DVD vs. Blu-Ray was a fairly evenly matched fight. I naively thought that it was pretty well the same feature set, just some differences in storage formats. Not being sure which would win, I went for the cheaper bet – I was going to buy an Xbox 360 anyway, so another $150 getting me HD-DVD seemed like a great way to hedge bets. Great picture, interactive stuff, good times for geek toys.

Now comes along Warner ditching HD-DVD, which prodded me to investigate the situation a little more. Seems that Blu-Ray costs more and does less. Feh. So what that you can put more on the disk? What's there does less… To make matters even worse, they of course have plans to upgrade to be just as technically good as HD-DVD is now, and the most recent players (in the $500 range) are starting to do that. This means anything I feel like buying now is going to be either obsolete or ½ (or less) the price within a year. If this were just a matter of spending $200 to go get something about as good as what I have now in order to play the other format, no problem. Spending twice that to get something not as good as what I have now, and then wanting to replace it in 6-12 months, is too much to pay for something that will be obsolete that fast.

They're wondering why we're not adopting this stuff – maybe it isn't so much a problem with dual format – after all, I had VHS and DVD co-exist for years, and vinyl, cassette and CD co-exist for longer than that – and maybe it's a problem with the price point being just too high. And now that all the bills from this little consumer adventure have come home to roost, that seems like something to pay attention to.

While the picture from the Dish is only 1080i, it's pretty nice, and now that it looks like Blu-Ray is really going to win, I suppose I'll wait until the feature set stabilizes and the prices drop – and just watch things on DVR, which doesn't cost anything more than I'm spending right now. My bet is that by the time the next Great American Buying Frenzy starts again, it will be reasonable. I'm really looking forward to the day that I can just download this stuff – Verizon claims they're laying fiber to everywhere (good luck getting it to the boonies, like my house), but that will get us out of the whole problem of needing individual physical media at roughly $1/GB when disc storage costs me $0.20/GB – and to heck with time-shifting things – if I want to watch episode 52 of Star Trek, I can just go do that – I can do this to some extent with the Xbox now.

I'll be back with more secure programming thoughts next week. I have some thoughts on the whole LoadLibrary search issue – our information in WSC2 is out of date, and some cool template code to make sure your allocation sizes are correct.

15 Most Influential Security People

This isn't exactly the list I would have drawn up, and I must be having a bad year, since I'm not on it <g>, but my friend Michael Howard is on the list. You can check it out here:

http://www.eweek.com/c/a/Security/The-15-Most-Influential-People-in-Security-Today/

My personal list would be a bit different, but this one is pretty good. I won't call most of them out here, since most of them tend to avoid publicity, but here's some people to consider:

  1. One of the smartest security guys I know – once found a very large number of security bugs in Windows with "notepad and me brain". He's done as much to improve kernel level security as anyone I know.
  2. A quiet security architect in Windows who has been responsible for huge numbers of improvements, like the impersonation privilege that shut down a whole class of attacks.
  3. Another quiet security architect who rarely makes noise in public who's been responsible for shutting down vast numbers of information disclosure leaks in Windows.
  4. A PowerPoint dev manager who deserves huge credit for driving up code quality levels, not only in his own product, but across Office and Microsoft.
  5. A couple of quiet guys in Excel who really get security and deserve a lot of credit for making their app better.
  6. Another very quiet security guy who came up with LUA and integrity levels – not perfect, but it's a huge improvement.
  7. The tester in Access that enabled us to do massively distributed fuzzing.
  8. The IIS team for going from a mess in IIS 5 to a truly stellar record in IIS 6
  9. Same thing for SQL – used to be a security mess, now it's really solid – and thanks to NGS for helping

10-10,000 or so – all those people in the code every day who really get it and strive to deliver secure products, no matter where they work. I've left off a lot of people, but my main point is that the people that matter the most aren't always the most visible, and some of the people that are most visible aren't doing a whole lot to really help users – and that's what matters. There's some that manage to do both.

This brings me back to a thought I had while reading this post to the SDL blog –

http://blogs.msdn.com/sdl/archive/2008/02/04/more-trustworthy-election-systems-via-sdl.aspx

It's pretty astonishing how badly they've fouled up something this important, and I agree that the elements of the SDL could have helped, but the really important missing ingredient is people in the trenches that really care about security, and management that sets this as a priority. Without that intangible, you can have all the SDL you want, and it won't matter. With those people who truly have their mind in the game and understand that quality must imply security, the SDL becomes a checklist to make sure you didn't forget anything. It's those people in the code day in and day out who I think have the most influence – and you'll never see most of them in public.
