Larry Osterman's WebLog

Confessions of an Old Fogey

Sometimes people don't really get the point of defensive programming for security...

I was reading this month's issue of Dr. Dobb's Journal, and I ran into the column "Illusions of Safety" by Pete Becker.  Pete writes about enhancements to the C language, and I usually really enjoy his columns.

This month was an exception.  Pete basically spent the entire column explaining why you don't have to worry about the "unsafe" C runtime library functions like gets() and strcpy() as long as you design your application correctly.

I'm just fine until he gets to the 2nd page of the article, "Prevention by Brute Force".  First off, he presents a "safe" version of strcpy called safe_strcpy which isn't actually a safe version of strcpy at all.  It's really a replacement for strncpy, and it preserves one of the critical bugs in strncpy: strncpy isn't guaranteed to null terminate the output string, and neither is his "safe" replacement.  He also describes testing for success on the strcpy as "tedious".  Yeah, I guess ensuring that you handle the failure of APIs is "tedious".  I'd also describe it as "correct".
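
To see the problem concretely, here's a minimal sketch (my example, not Pete's code) of how strncpy - and any "safe" wrapper that mimics it - can hand you an unterminated string:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char buf[4];

        /* The pitfall: strncpy does NOT null terminate the destination
           when the source is as long as (or longer than) the buffer. */
        strncpy(buf, "ABCDEFG", sizeof(buf));    /* buf now holds "ABCD" with no '\0' */

        /* A genuinely safe copy has to terminate explicitly: */
        strncpy(buf, "ABCDEFG", sizeof(buf) - 1);
        buf[sizeof(buf) - 1] = '\0';             /* truncated, but always a valid string */
        printf("%s\n", buf);                     /* prints "ABC" */
        return 0;
    }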

But it's when we get to the 3rd page "Prevention by Design" that Pete totally loses me.

You see, on Page 3, Pete decides that it's OK to use strcpy.  You just need to make sure that you put an assert() to make sure it doesn't overflow.  And you need to make sure that the inputs to your functions are checked upon entry to the system.

But that's just plain wrong.  First off, the assert() won't be there in non-debug code - asserts disappear in production code, so your production code won't have the protection of the assert.  And it completely ignores the REAL cause of the problem - what happens when the vulnerable function is called from an unchecked code path.  If (when) that happens, you've got a security hole.  And the bad guys ARE going to find it.  Michael Howard gives LOTS of examples where developers added checks to a vulnerable code path at the entry point of a function without realizing that there was another code path to the vulnerable function.  You also don't know what will happen four or five years in the future - it may be that a future maintainer of your code won't realize that your low-level routine has such a constraint and calls it improperly.
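
Here's a minimal sketch of the pattern Pete describes (copy_name is an illustrative name of mine, not code from the article), showing why the assert buys you nothing in production - defining NDEBUG, the normal setting for release builds, compiles it away:

    #include <assert.h>
    #include <string.h>

    void copy_name(char *dest, size_t dest_size, const char *src)
    {
        assert(strlen(src) < dest_size);   /* vanishes when NDEBUG is defined */
        strcpy(dest, src);                 /* overflows dest if src is too long */
    }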

It's far better to replace the strcpy() call with a strcpy_s() call and make the caller pass in the size of the target buffer.  That way you don't rely on others to protect your vulnerability.  There is at least one known vulnerability that was caused by someone taking Pete's advice.  A development organization had a vulnerability reported to them, and they fixed it by adding a check up front to cut off the vulnerability.  Since they'd removed the vulnerability, they released a patch containing the fix.  Unfortunately, they didn't realize that there was another path to the vulnerable function in their code, and they had to re-release the patch.  This time they did what they should have done in the first place: not only did they add the up-front test, they also removed the vulnerable call to strcpy, so they'll never be caught by that call again.
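
A minimal sketch of that fix, assuming a C runtime that provides strcpy_s (the Microsoft CRT's secure string functions do; copy_name is still an illustrative name):

    #include <string.h>

    int copy_name(char *dest, size_t dest_size, const char *src)
    {
        /* The copy itself enforces the bound, no matter which
           code path reaches this function. */
        if (strcpy_s(dest, dest_size, src) != 0)
            return -1;    /* source too long or bad arguments: fail, don't overflow */
        return 0;
    }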

On page 4, he claims that it IS possible to use gets() correctly.  It's OK to have programs that read from the console, as long as those programs are only called from other, known programs.  But of course, this totally ignores the reality of how the bad guys work.  Security vulnerabilities are rarely exposed by someone using an API or protocol in the way it was intended.  They almost always occur because someone DIDN'T use a program in the way it was intended.  For instance, I know there was at least one *nix vulnerability that was explicitly caused by a setuid application calling gets.  Someone fed a bogus command line string to that application and presto! they had root access to the machine.
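
For contrast, here's a minimal sketch of the standard safe replacement for gets - fgets is told how big the buffer is, so it doesn't matter who calls the program or what gets fed to it:

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char line[128];

        if (fgets(line, sizeof(line), stdin) != NULL) {
            line[strcspn(line, "\n")] = '\0';   /* strip the trailing newline, if any */
            printf("read: %s\n", line);
        }
        return 0;
    }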

Here's another way of thinking about it (oh no, not another stupid automotive analogy).  If you drive safely, you don't need seat belts (or airbags).  But I wouldn't even DREAM of riding in a car without at least seat belts.  Why?  Seat belts and airbags are a form of defense in depth.  Your primary defense is safe driving.  But just in case, your seat belts (or airbags or both) may save your life.


Bottom line: A good architecture is no substitute for Defense in Depth.  Sure, apply the architectural changes that Pete described.  They're great.  But don't believe that they are adequate to ensure safety. 

  • I think the real point of the article is being missed here. Surely everyone here would agree that formal proof of buffer overflow impossibility through static analysis and design is a MUCH better way to produce secure software than runtime buffer size checks?

    Just checking string bounds isn't always enough -- you have to make sure that a truncated string isn't used in a way that could cause a different vulnerability, or that throwing exceptions doesn't allow an easy way to cause a DoS, etc. I don't think bounds checking is a bad idea, as it definitely helps with robustness as well as security (think hardware errors!), but it's not a _great_ solution. For calls that are solely internal, it's a bit like speculatively adding try/catch.

    Personally, I would rather see more emphasis on language features and tools that help me declare and enforce guarantees at compile time, because that would help avoid a lot more problems than just buffer overflows.

    Also, I think the performance issue is a bit of a red herring, not because the checks are "free" -- they aren't, either in runtime or code size cost -- but because string operations themselves aren't cheap, and overuse of strings is a common performance problem.
  • Friday, October 06, 2006 3:03 AM by Phaeron

    > formal proof of buffer overflow impossibility through static
    > analysis and design is a MUCH better way to produce secure
    > software than runtime buffer size checks?

    That is 100% true.

    For some kinds of programs you can do a formal proof.  How do you construct a formal proof?  In my experience it was done by humans applying logic to demonstrate that each fact is provable from previously proven facts.  In other words, humans were applying the same logic that humans apply in writing programs.  So the bug rate wasn't zero; it might be geometrically lower (two simultaneous failures being geometrically less probable than one failure), but the actual results weren't that good.

    In the general case there won't always be even a theoretical possibility of a formal proof.  It reduces to the halting problem.

    Now, C strings are miserable.  Most of the C language combines the power of assembly language with the beauty of assembly language because it was invented as a substitute for assembly language.  Of course since being standardized it's not always possible to use it that way any more, but it still reflects the beauty of its origins (^_irony).  The strings section of the library seems to have been designed more hastily.  It took advantage of machine instructions which an assembly language programmer could code quickly for one kind of machine.  It didn't prepare for efficient but harder coding that would measure once before copying and concatenating and comparing several times.  C++, being object-oriented assembly, derives from the same weak base, though it's a bit easier to make counted string libraries that will be easily usable by clients.

    A few days ago I used _tcscpy safely.  I ran a few _tcslen calls, added the results, did a malloc, checked the result of malloc, and then if still running I copied each string to the appropriate portion of the resulting buffer.  Then I cringed at the thought of what will happen when someone copies and pastes the code without figuring out why it was OK this once.  I wished I could just use VB's & operator instead.

  • The real solution is to use C++

  • > The real solution is to use C++

    Why stop here and not switch to C# or VB?

  • As I understand it, the main argument in response to "Why stop here [C++] and not switch to C# or VB[.NET]?" is that C++ is still a LOT faster running...like 16 to 64 times faster...mostly due to the absence of strict variable typing and the direct access it affords to memory.

    That's how I used to think even just a few years ago, and I still do, to the extent that I think unmanaged code is still a hell of a lot better for, say, massive scientific simulations and the like, because it's still WAY WAY WAY faster, deny it all you like.

    However, for pretty much any business-related project I could imagine, and most anything else non-scientific (or gaming), the hardware is now simply so fast that who cares how much slower managed code executes?...it's just not generally an issue that it runs many times slower.  So, in the final analysis, I would never use C or even C++ for anything usual, especially if I needed the result to securely withstand public use.

  • "A few days ago I used _tcscpy safely.  I ran a few _tsclen calls, added the results, did a malloc, checked the result of malloc, and then if still running I copied each string to the appropriate portion of the resulting buffer."

    Did you remember to check for overflow on your addition?
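
    For reference, a sketch of that pattern with the overflow check added (concat_alloc is an illustrative name, not code from the thread):

        #include <stdint.h>
        #include <stdlib.h>
        #include <string.h>

        char *concat_alloc(const char *a, const char *b)
        {
            size_t la = strlen(a), lb = strlen(b);
            if (la >= SIZE_MAX - lb)        /* la + lb + 1 would wrap around */
                return NULL;
            char *buf = malloc(la + lb + 1);
            if (buf == NULL)
                return NULL;
            memcpy(buf, a, la);
            memcpy(buf + la, b, lb + 1);    /* the +1 copies b's terminating '\0' */
            return buf;
        }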

  • He's not claiming that you should use assert(3) for parameter validation; indeed in that bit of the article he explicitly states that he wants to avoid doing parameter validation everywhere on efficiency grounds(!). The assert is just an expression of the "contract" of the function. Sure, the suggestions are pretty silly, but not really in the way you suggest, at least in that case.

  • If C# and C++ are too slow, then switch to Java.

    I switched and never looked back.
