Larry Osterman's WebLog

Confessions of an Old Fogey
Blog - Title

Sometimes people don't really get the point of defensive programming for security...

Sometimes people don't really get the point of defensive programming for security...

  • Comments 23

I was reading this months issue of Dr. Dobbs Journal, and I ran into the column "Illusions of Safety" by Pete Becker.  Pete writes about enhancements to the C language, and I usually really enjoy his columns.

This month was an exception.  Pete basically spent the entire column explaining why you don't have to worry about the "unsafe" C runtime library functions like gets() and strcpy() as long as you design your application correctly.

I'm just fine until he gets to the 2nd page of the article, "Prevention by Brute Force".  First off, he presents a "safe" version of strcpy called safe_strcpy which isn't actually a safe version of strcpy.  It's actually a replacement for strncpy, and preserves one of the critical bugs in strncpy - strncpy isn't guaranteed to null terminate the output string, and neither is his "safe" replacement.  He also describes testing for success on the strcpy as "tedious".  Yeah, I guess that ensuring that you handle the failure of API's is "tedious".  I'd also describe it as "correct".

But it's when we get to the 3rd page "Prevention by Design" that Pete totally loses me.

You see, on Page 3, Pete decides that it's OK to use strcpy.  You just need to make sure that you put an assert() to make sure it doesn't overflow.  And you need to make sure that the inputs to your functions are checked upon entry to the system.

But that's just plain wrong.  First off, the assert() won't be there in non debug code - assert()'s disappear in production code.  So your production code won't have the protection of the assert.  And it completely ignores the REAL cause of the problem - what happens when the vulnerable function is called from an unchecked code path.  If (when) that happens, you've got a security hole.  And the bad guys ARE going to find it.  Michael Howard gives LOTS of examples where developers added checks to a vulnerable code path at the entry point of a function without realizing that there was another code path to the vulnerable function.  You also don't know what will happen four or five years in the future - it may be that a future maintainer of your code won't realize that your low level routine has such a constraint and calls it improperly. 

It's far better to replace the strcpy() call with a strcpy_s() call and make the caller pass in the size of the target buffer.  That way you don't rely on others to protect your vulnerability.  There is at least one known vulnerability that was caused by someone taking Pete's advice.  A development organization had a vulnerability reported to them, and they fixed it by adding a check up-front to cut off the vulnerability.  Since they'd removed the vulnerability, they released a patch containing the fix.  Unfortunately, they didn't realize that there was another path to the vulnerable function in their code, and they had to re-release the patch.  This time they did what they should have done in the first place and not only did they add the up-front test, they also removed the vulnerable call to strcpy - that way they'll never be caught by that vulnerable call again.

On page 4, he claims that it IS possible to use gets() correctly.  It's ok to have programs that read from the console as long as those programs are only called from other, known programs.  But of course, this totally ignores the reality of how the bad guys work.  Security vulnerabilities are rarely exposed by someone using an API or protocol in the way it was intended.  They almost always occur because someone DIDN'T use a program in the way it was intended.  For instance, I know there was at least one *nix vulnerability that was explicitly caused by a setuid application calling gets.  Someone fed a bogus command line string to that application and presto! they had root access to the machine.

Here's another way of thinking about it (oh no, not another stupid automotive analogy).  If you drive safely, you don't need seat belts (or airbags).  But I wouldn't even DREAM of riding in a car without at least seat belts.  Why?  Seatbelts and airbags are a form of defense in depth.  Your primary defense is safe driving.  But just in case, your seat belts (or airbags or both) may save your life.

 

Bottom line: A good architecture is no substitute for Defense in Depth.  Sure, apply the architectural changes that Pete described.  They're great.  But don't believe that they are adequate to ensure safety. 

  • Glad I'm not the only one who was confounded by Pete's column. When he started into using gets(), I just gave up and assumed that it was a mis-dated April fool column.

    Pete's statement about not worrying about gets() buffer overflows in a "suite of applications... with the assurance that designed-in assumptions about line lenghts are valid" and "...not something I stay up nights worrying about: My computer sits on my desk, and no malicious user can sneak in during the night and modify my saved files." strikes me as silly. If you're talking about a program you wrote to catalog your CD's or somesuch, that's fine. A program that's going to be running on computers that are not "your computer" can't act that way, though. Someone else _is_ going to stay up nights worrying about it-- or, rather, looking for it.

    It seems like good practice to me to have every function check the validity of the caller's arguments, and to check the status returned by any functions it calls. The only time I'd deviate from this design would be when profiling shows a clear performance win. I'd change the function to perform validation "up front", and remove per-iteration validation. Of course, you must guarantee that there are no paths of execution that would bypass the "up front" validation, and as such, I'd consider the removal of parameter validation from a called function as grounds to inline that function in the caller, just to avoid a potential path of execution that lacked validation.

  • I think that every function checking the validity of its arguments is probably overkill (it gets into the "IsBadXxxPtr" problem).  If the function is internal, then assert() is fine.

    But assert() is not and can not be a substitute for defensive programming.

  • You're right, that article is pants. First, it's not clear that "safe" functions are any slower. Modern processors can schedule three instructions at the same time, especially trivia like using a register to keep track of the free space in a buffer. Second, if "safe" functions are too slow (implausible as this is), then use a better string representation. If you're checking all string lengths at the boundary then why not convert them to a better representation, such as counted strings, at the same time. This has the further advantage that there's no chance of unchecked strings slipping through to your potentially unsafe code. But your automotive analogy is poor. Most drivers have a certain level of acceptable risk. If you reduce their risk with safety features they'll drive more recklessly to increase the risk back to their acceptable level. So the gross effect of safety features is no improvement in driver safety coupled with increased risk for pedestrians, cyclists, etc. A net loss. Much research supports this.
  • I hate to say it, but Dr Dobbs lost me many year ago when the published an article about a new super fast sorting algorithm. What was the great advancement? Swapping pointers to elements instead of elements. Um... yeah... Makes me wonder if there is any type of review. It really isn't the articles that I know are stupid that scare me, it is the articles that I don't know that they are stupid.
  • You hit the nail on the head.  And it's an extremely wide-spread problem.  And, as you say, it's common to simply fall-back to "just design it correctly" and use of str*() is fine.  If that were the case we wouldn't have security issues with software.

    The scary part is; there's 3rd party libraries modeled around the CRT--rife with unsized pointer arguments, meaning there's no possible way it can reliably ensure it's safe.

  • I agree with your criticisms of the article. By the way, though gets isn't the only problem, I was amazed to see that writeup. Around 10 years ago Doug Gwyn was defending use of gets in situations where he felt its use was sufficiently controlled. I was amazed then too. No one agreed with him, which of course wasn't amazing a bit. Now someone agrees with him. > (oh no, not another stupid automotive analogy). If you drive > safely, you don't need seat belts Yeah, there are people who say that, who really believe that as long as they drive safely there won't be some idiot coming the other way and suddenly making a right turn[*] right in front of you with no possibility for you to stop in time, there won't be some idiot barrelling through a red light after everyone else who has the red light already stopped and others who have the new green light already passed through, etc. [* In some countries think about that as a left turn instead.] Meanwhile did you read his footnotes? >> [3] When Borland was working on its first compiler that >> supported Windows, we had a mysterious crash in one of >> the programming examples from Microsoft's Windows SDK. >> The program ran fine with Microsoft's compiler and library, >> and crashed with ours. Since our compiler and library were >> incomplete, we naturally focused on them. But it turned out >> that the problem was in the example, which had a buffer >> overrun. We can STILL find problems like that, both in example code in MSDN articles and in some of the header files that Microsoft distributes as source. A few days ago for some reason I was reading part of winnt.h, two versions of which get included in programs build by Visual Studio 2005 for Windows Mobile 5 targets. I don't think I saw anything that would cause a buffer overrun, but some of the contents still sent chills up my spine. Then think about the stuff that Microsoft doesn't publish because it's likely even buggier than the stuff they do publish.
  • Great article Larry. The old saying 'don't believe everything you read' is quite true here. Without people like you willing to stick their head up, some of us less experienced programmers might follow their advice with the inevitable consequences.
  • OpenBSD guys created a safe strlcpy: http://www.gratisoft.us/todd/papers/strlcpy.html
  • This is offtopic, but your blog comments seem to randomly switch between supporting HTML syntax and not.

    e.g. This should be a new paragraph (using <p>).

  • Yeah, I know.  We recently switched the site to CS 2.1 and this apparently is a side effect.  I've complained to the appropriate people.

  • Why not use strcpy if I've proven that the parameters will not cause an overflow? Data that comes from file/network/user input can be anything and it has to be appropriately handled and checked. But internal (or processed by your carefully designed app) data can be guaranteed to be safe. In this case assert() is enough.
  • Sergei: It's possible to use strcpy safely.  But you need to first verify that the source string being copied is shorter than the destination buffer, which requires traversing the string twice (once to find the length, the 2nd time to copy the data).  But why NOT use strcpy_s which does both at the same time?

    Otherwise you're trusting your callers.

    I've had MANY situations where developers made a change, tested it on the retail bits and checked in the change.  Only later, when someone was testing the debug bits did they realize that the other developer caused an assert to fire.  That's because asserts are only used to validate invariants.  Asserts are fine, but since they're not always present, they aren't a suffcient protection.

    People keep on trying to give arguments why they should use broken paradigms.  Why?  There are replacements that have equivilant performance characteristics that are safe, why not use them?

  • >one of the critical bugs in strncpy - strncpy isn't guaranteed to null terminate the output string

    If it did there would be no way to detect the case when the destination is too small to hold the whole source.

  • sergei: Interesting.  I've never heard of that usage for strncpy (checking the byte at <length> for 0).  I much prefer APIs that return errors when they fail as opposed to having to probe for side-effects.

  • If you're the world's best and safest driver, seatbelts protect you from all the morons out there. If you're the world's best and safest programmer, safe string copy functions protect your code from all the morons out there who accidentally work around your safety checks. And, also, if you're not quite as great as you think you are, they keep you from getting a whole lot of egg on your face. ;)
Page 1 of 2 (23 items) 12