I was reading this month's issue of Dr. Dobb's Journal, and I ran into the column "Illusions of Safety" by Pete Becker. Pete writes about enhancements to the C language, and I usually really enjoy his columns.
This month was an exception. Pete basically spent the entire column explaining why you don't have to worry about the "unsafe" C runtime library functions like gets() and strcpy() as long as you design your application correctly.
I'm just fine until he gets to the second page of the article, "Prevention by Brute Force". First off, he presents a "safe" version of strcpy called safe_strcpy which isn't actually a safe version of strcpy. It's really a replacement for strncpy, and it preserves one of the critical bugs in strncpy - strncpy isn't guaranteed to null-terminate the output string, and neither is his "safe" replacement. He also describes testing for success on the strcpy as "tedious". Yeah, I guess that ensuring that you handle the failure of APIs is "tedious". I'd also describe it as "correct".
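A minimal sketch of that bug: when the source fills (or overflows) the destination, strncpy writes no terminator at all. The terminated_copy wrapper below is hypothetical (not from the column) - it shows the one extra step that safe_strcpy skips.

```c
#include <string.h>

/* Hypothetical wrapper: copy at most dst_size-1 characters, then
   always write the terminator that strncpy may omit. */
static void terminated_copy(char *dst, size_t dst_size, const char *src)
{
    strncpy(dst, src, dst_size - 1);  /* copies at most dst_size-1 chars */
    dst[dst_size - 1] = '\0';         /* the step strncpy skips on overflow */
}
```

With a 4-byte buffer and a 6-character source, raw strncpy leaves the buffer unterminated, while terminated_copy yields a properly terminated (truncated) string.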
But it's when we get to the 3rd page "Prevention by Design" that Pete totally loses me.
You see, on Page 3, Pete decides that it's OK to use strcpy. You just need to make sure that you put an assert() to make sure it doesn't overflow. And you need to make sure that the inputs to your functions are checked upon entry to the system.
But that's just plain wrong. First off, the assert() won't be there in non-debug code - assert()s disappear in production code, so your production code won't have the protection of the assert. And it completely ignores the REAL cause of the problem - what happens when the vulnerable function is called from an unchecked code path. If (when) that happens, you've got a security hole, and the bad guys ARE going to find it. Michael Howard gives LOTS of examples where developers added checks to a vulnerable code path at the entry point of a function without realizing that there was another code path to the vulnerable function. You also don't know what will happen four or five years in the future - it may be that a future maintainer of your code won't realize that your low-level routine has such a constraint and will call it improperly.
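To illustrate the point: release builds compile with NDEBUG, which turns assert(x) into a no-op. The RELEASE_ASSERT macro below mimics that expansion so the effect is visible without rebuilding; copy_checked is a made-up example, and the lesson is that only the explicit length check - not the assert - actually protects the copy.

```c
#include <string.h>

/* What assert() becomes when NDEBUG is defined: nothing at all. */
#define RELEASE_ASSERT(cond) ((void)0)

/* Hypothetical example: the "assert protection" vanishes in release
   builds, so the explicit check is the only real defense. */
static int copy_checked(char *dst, size_t dst_size, const char *src)
{
    RELEASE_ASSERT(strlen(src) < dst_size); /* gone in production */
    if (strlen(src) >= dst_size)            /* always present     */
        return -1;
    strcpy(dst, src);                       /* safe: length verified */
    return 0;
}
```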
It's far better to replace the strcpy() call with a strcpy_s() call and make the caller pass in the size of the target buffer. That way you don't rely on others to protect your vulnerability. There is at least one known vulnerability that was caused by someone taking Pete's advice. A development organization had a vulnerability reported to them, and they fixed it by adding a check up-front to cut off the vulnerability. Since they'd removed the vulnerability, they released a patch containing the fix. Unfortunately, they didn't realize that there was another path to the vulnerable function in their code, and they had to re-release the patch. This time they did what they should have done in the first place and not only did they add the up-front test, they also removed the vulnerable call to strcpy - that way they'll never be caught by that vulnerable call again.
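The strcpy_s family is only guaranteed on MSVC and C11 Annex K implementations, so here's a minimal portable stand-in (my_strcpy_s is my name, not a standard one) with the same shape: the caller passes the destination size, and the call fails instead of overflowing.

```c
#include <string.h>

/* Hypothetical strcpy_s-style copy: refuse rather than overflow. */
static int my_strcpy_s(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;
    if (strlen(src) >= dst_size) {  /* won't fit, terminator included */
        dst[0] = '\0';              /* leave a valid (empty) string   */
        return -1;
    }
    strcpy(dst, src);               /* safe: length already verified  */
    return 0;
}
```

Because the size travels with every call, a new, unchecked code path to this function still can't overflow the buffer - the protection lives at the vulnerable call itself, not at some distant entry point.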
On page 4, he claims that it IS possible to use gets() correctly. It's OK to have programs that read from the console as long as those programs are only called from other, known programs. But of course, this totally ignores the reality of how the bad guys work. Security vulnerabilities are rarely exposed by someone using an API or protocol in the way it was intended. They almost always occur because someone DIDN'T use a program in the way it was intended. For instance, I know there was at least one *nix vulnerability that was explicitly caused by a setuid application calling gets. Someone fed a bogus command line string to that application and presto! they had root access to the machine.
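The core problem with gets() is that it has no way to learn the buffer size, so a hostile caller can always overflow it. fgets() takes the size and stops in time; this read_line sketch (a hypothetical helper) also strips the trailing newline that fgets keeps.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical gets() replacement: bounded read via fgets, with the
   trailing newline removed so it behaves like gets did. */
static char *read_line(char *buf, size_t size, FILE *stream)
{
    if (fgets(buf, (int)size, stream) == NULL)
        return NULL;                  /* EOF or read error */
    buf[strcspn(buf, "\n")] = '\0';   /* drop the newline, if any */
    return buf;
}
```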
Here's another way of thinking about it (oh no, not another stupid automotive analogy). If you drive safely, you don't need seat belts (or airbags). But I wouldn't even DREAM of riding in a car without at least seat belts. Why? Seatbelts and airbags are a form of defense in depth. Your primary defense is safe driving. But just in case, your seat belts (or airbags or both) may save your life.
Bottom line: A good architecture is no substitute for Defense in Depth. Sure, apply the architectural changes that Pete described. They're great. But don't believe that they are adequate to ensure safety.
Glad I'm not the only one who was confounded by Pete's column. When he started into using gets(), I just gave up and assumed that it was a mis-dated April fool column.
Pete's statement about not worrying about gets() buffer overflows in a "suite of applications... with the assurance that designed-in assumptions about line lengths are valid" and "...not something I stay up nights worrying about: My computer sits on my desk, and no malicious user can sneak in during the night and modify my saved files." strikes me as silly. If you're talking about a program you wrote to catalog your CDs or somesuch, that's fine. A program that's going to be running on computers that are not "your computer" can't act that way, though. Someone else _is_ going to stay up nights worrying about it-- or, rather, looking for it.
It seems like good practice to me to have every function check the validity of the caller's arguments, and to check the status returned by any functions it calls. The only time I'd deviate from this design would be when profiling shows a clear performance win. I'd change the function to perform validation "up front", and remove per-iteration validation. Of course, you must guarantee that there are no paths of execution that would bypass the "up front" validation, and as such, I'd consider the removal of parameter validation from a called function as grounds to inline that function in the caller, just to avoid a potential path of execution that lacked validation.
I think that every function checking the validity of its arguments is probably overkill (it gets into the "IsBadXxxPtr" problem). If the function is internal, then assert() is fine.
But assert() is not and can not be a substitute for defensive programming.
You hit the nail on the head. And it's an extremely widespread problem. And, as you say, it's common to simply fall back on "just design it correctly" and claim that use of str*() is fine. If that were the case we wouldn't have security issues with software.
The scary part is, there are 3rd-party libraries modeled around the CRT, rife with unsized pointer arguments, meaning there's no possible way they can reliably ensure they're safe.
e.g. This should be a new paragraph (using <p>).
Yeah, I know. We recently switched the site to CS 2.1 and this apparently is a side effect. I've complained to the appropriate people.
Sergei: It's possible to use strcpy safely. But you need to first verify that the source string being copied is shorter than the destination buffer, which requires traversing the string twice (once to find the length, the 2nd time to copy the data). But why NOT use strcpy_s which does both at the same time?
Otherwise you're trusting your callers.
I've had MANY situations where developers made a change, tested it on the retail bits, and checked in the change. Only later, when someone was testing the debug bits, did they realize that the other developer had caused an assert to fire. That's because asserts are only used to validate invariants. Asserts are fine, but since they're not always present, they aren't a sufficient protection.
People keep on trying to give arguments why they should use broken paradigms. Why? There are replacements that have equivalent performance characteristics and that are safe, so why not use them?
>one of the critical bugs in strncpy - strncpy isn't guaranteed to null terminate the output string
If it did there would be no way to detect the case when the destination is too small to hold the whole source.
sergei: Interesting. I've never heard of that usage for strncpy (checking the byte at <length> for 0). I much prefer APIs that return errors when they fail as opposed to having to probe for side-effects.
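For the record, sergei's probing idiom works because strncpy null-pads short copies: after the call, dst[n-1] is '\0' exactly when the whole source, terminator included, fit. This hypothetical strncpy_truncated helper packages that probe:

```c
#include <string.h>

/* Hypothetical helper wrapping the probing idiom: returns 1 when the
   source didn't fit as a terminated string (dst is then unterminated),
   0 when the copy fit with room for the terminator. */
static int strncpy_truncated(char *dst, const char *src, size_t n)
{
    strncpy(dst, src, n);        /* null-pads when src is short enough */
    return dst[n - 1] != '\0';   /* non-zero last byte signals truncation */
}
```

Note that on truncation the caller still has to terminate dst before using it as a string - which is exactly the kind of side-effect probing that an error-returning API avoids.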