One thing that continues to amaze me is the range of powerful tools available to developers and QA nowadays. Application performance can be improved through profiling and optimization tools operating statically and/or dynamically on the binary (using PGO, for example). Testing metrics become more accurate with instrumentation and code coverage tools. Compatibility and conformance issues can be detected with applications like AppVerifier (targeting Windows XP) and FxCop (targeting the .NET Framework). And then there is the security space, which is increasing in importance and where Microsoft has invested much effort and research. In this space we have various runtime validation techniques and verifiers (/GS, heap verifier), static analysis of object code (FxCop again), and static analysis of source code (prefast, which is the codename for the /analyze compiler switch, and SAL annotations). While no tool can do everything, and there is some overlap in their purposes, used in combination they go a long way toward ensuring high quality product development.
In this post, I will focus on static source code analysis using prefast. I believe the justification for using it is very strong. First, it can find the very serious defects that everyone is afraid of (crashes, blue screens, freezes, exploitable security holes) and which may be quite hard to detect with code inspection and debugging. Prefast is able to detect most cases of: memory management defects (leaks), pointer management defects (double free, freeing a pointer to already freed or non-allocated memory such as the stack or a global variable, freeing in the middle of a block, returning a pointer to a local), initialization defects (using uninitialized variables, freeing or dereferencing an uninitialized or null pointer, writing to constants), and boundary violations (buffer overruns and underruns). Second, because static analysis means scanning the source code, the defects are signaled at compile time, and the earlier in the product cycle the better. Because we’re operating directly on source code, the error messages are more explanatory and precisely located. No debugging is needed to find the wrong line of code and the variables involved. After a short period of getting used to the tool, the error messages suggest very obvious solutions. Static analysis tools are not meant to replace code reviews and test plans, but they are an excellent extra validation and may offer even more coverage than test cases.
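To make the defect categories above concrete, here is a minimal sketch. The function `duplicate_text` is hypothetical and written only for illustration; the defective variant is kept in a comment so the block stays safe to compile and run. Which of these defects a given prefast version actually reports is not something this example claims.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Defective variant (comment only). It combines several of the categories
// listed above:
//
//   char* duplicate_text(const char* text) {
//       std::size_t len = std::strlen(text);      // text may be null
//       char* copy = (char*)std::malloc(len);     // no room for the terminator
//       for (std::size_t i = 0; i <= len; ++i)    // writes copy[len]: overrun
//           copy[i] = text[i];                    // copy may be null as well
//       if (len == 0) return nullptr;             // leaks copy
//       return copy;
//   }

// Corrected variant: every pointer is checked before use, the allocation
// includes the terminator, and no path loses the allocated block.
char* duplicate_text(const char* text) {
    if (text == nullptr) {
        return nullptr;
    }
    const std::size_t len = std::strlen(text);
    char* copy = static_cast<char*>(std::malloc(len + 1));
    if (copy == nullptr) {
        return nullptr;
    }
    std::memcpy(copy, text, len + 1);  // copies the terminator too
    return copy;                       // caller releases with std::free
}
```

The caller owns the returned buffer and releases it with `std::free`.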
Nobody is saying that automated tools can find all defects, nor that they don’t generate noise. There is always a tradeoff to consider between diagnostic accuracy, completeness and speed. But overall, experience shows that these tools can find plenty of serious bugs, at the expense of a reasonable amount of noise. Static code analysis tools evaluate the possible paths from the beginning of a function to every exit point. In practice, depending on the speed vs. thoroughness requirements, the number of paths processed at one time can be limited to a configurable maximum. Also, when there is enough information in the code, some paths may be deduced to be unreachable and skipped. It now becomes clear where noise comes from: paths that are not skipped because the code does not contain enough information for the tool to analyze or understand. Likewise, the cap on the number of paths may cause some defects not to be reported. However, these scenarios are quite rare and don’t overshadow the benefits of using static analysis tools.
Prefast is meant to be run by both devs and QA before checking into the depot, as a quality requirement. It’s designed to execute quickly, and this performance requirement comes at a price: less thorough code scanning and some limitations. Otherwise, the pain and slowness of running it would outweigh the benefits. It doesn’t scale well to large code bases, it has no information about global state, there are constructs it doesn’t understand, and it is not able to look recursively into called code. While analyzing a function, the only thing prefast can do with the functions it calls is check their signature annotations, without analyzing their implementation (their implementation is not skipped, though: every function is eventually analyzed as a unit in its own right). Also, it doesn’t detect defects in non-instantiated templates.
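The last limitation is easy to overlook, so here is a small sketch. Both templates are hypothetical. The first contains an off-by-one read but is never instantiated, so the compiler never generates a concrete function body for it; per the limitation described above, that defect would go unreported. The second is instantiated by `sum_three`, which brings its body into scope.

```cpp
#include <cstddef>

// Hypothetical template with a defect: the loop lets i reach N and reads
// one element past the end of the array. Nothing in this file instantiates
// it, so no concrete function exists to be analyzed.
template <std::size_t N>
int sum_off_by_one(const int (&values)[N]) {
    int total = 0;
    for (std::size_t i = 0; i <= N; ++i) {
        total += values[i];
    }
    return total;
}

// Correct counterpart, instantiated below with N == 3.
template <std::size_t N>
int sum_all(const int (&values)[N]) {
    int total = 0;
    for (std::size_t i = 0; i < N; ++i) {
        total += values[i];
    }
    return total;
}

// Calling the template forces the instantiation sum_all<3>.
int sum_three() {
    const int values[3] = {1, 2, 3};
    return sum_all(values);
}
```

One way to bring template code into scope, offered here as a suggestion rather than something the post states, is to instantiate it from a small unit test or with an explicit instantiation.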
I can give you a few examples of the non-recursiveness limitation. If you are about to call a function whose parameter is required to be non-null, prefast still emits a warning when the null check is performed inside a separate helper function. If you express the check explicitly in the current function body, or with a macro, then prefast understands that the requirement is met. In other cases, you may get a NULL pointer dereference warning even though the NULL case is checked a few lines above, because the check ends in a call to a no-return function (one that throws, or an invalid parameter handler) instead of an explicit return. Fortunately, there is a workaround for this scenario: annotate the no-return function with __declspec(noreturn), a construct that prefast understands well.
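The following sketch shows the three shapes discussed above. All function names and the `DEMO_NORETURN` macro are hypothetical. `__declspec(noreturn)` is the MSVC spelling named in the post; the fallback to the standard `[[noreturn]]` attribute is an assumption made so the sketch also builds with other compilers.

```cpp
#include <stdexcept>

#if defined(_MSC_VER)
#define DEMO_NORETURN __declspec(noreturn)
#else
#define DEMO_NORETURN [[noreturn]]  // assumption: standard equivalent elsewhere
#endif

// Hypothetical failure handler. It always throws, so control never returns
// to the caller; the annotation states that fact in the signature.
DEMO_NORETURN void report_invalid_argument(const char* what) {
    throw std::invalid_argument(what);
}

// The null case ends in a no-return call instead of an explicit return.
// Without the annotation above, an analyzer that does not look into
// report_invalid_argument has to assume execution continues to values[0].
int first_element(const int* values) {
    if (values == nullptr) {
        report_invalid_argument("values must not be null");
    }
    return values[0];
}

// Hypothetical helper: a check hidden in here is, per the post, not seen
// while the calling function is being analyzed.
bool is_usable(const int* values) {
    return values != nullptr;
}

int first_or_zero_via_helper(const int* values) {
    if (!is_usable(values)) {
        return 0;
    }
    return values[0];
}

// The same logic with the check written out in the function body, which is
// the form the post says prefast understands.
int first_or_zero_explicit(const int* values) {
    if (values == nullptr) {
        return 0;
    }
    return values[0];
}
```

All three functions behave correctly at run time; the difference lies only in how much of that correctness is visible from inside a single function body.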
Prefast works very well with macros because analysis happens after the preprocessing step of compilation. No check is lost, unless the wrong macros are chosen to perform variable checks. As an example, if a check is required and is performed with an assert, the prefast warning is not emitted for debug builds but you’ll still see it in a retail build (unless the code involved is also compiled out of retail builds). Hence an ensure-style macro is more suitable if the current function may throw (otherwise use a non-throwing macro that is still present in retail builds). Also avoid using assume/restrict (or macros built on these keywords) when trying to fix a prefast warning: they drive compiler optimizations, and the only effect may be to silence prefast falsely, allowing defects to remain undetected.
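A minimal sketch of the assert versus ensure distinction follows. `DEMO_ENSURE` is a hypothetical macro written for this example, not one taken from the CRT, ATL or MFC headers. The standard `assert` expands to nothing when `NDEBUG` is defined, so in a retail build the first function contains no check at all, while the second keeps its check in every configuration.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical ensure-style macro: present in debug and retail builds alike,
// and it throws on failure, so it suits functions that are allowed to throw.
#define DEMO_ENSURE(cond)                                       \
    do {                                                        \
        if (!(cond)) {                                          \
            throw std::runtime_error("ensure failed: " #cond);  \
        }                                                       \
    } while (0)

// The check vanishes when NDEBUG is defined, leaving the dereference
// below unguarded in retail builds.
int length_checked_by_assert(const char* text) {
    assert(text != nullptr);
    int length = 0;
    while (text[length] != '\0') {
        ++length;
    }
    return length;
}

// The check survives preprocessing in every build configuration.
int length_checked_by_ensure(const char* text) {
    DEMO_ENSURE(text != nullptr);
    int length = 0;
    while (text[length] != '\0') {
        ++length;
    }
    return length;
}
```

For a function that must not throw, the same idea applies with a macro that returns an error code instead.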
SAL annotations (Standard Annotation Language) can significantly increase the power of static analysis tools and the accuracy of their diagnostics. Requirements and restrictions on parameters, function signatures and execution flow, which are otherwise subtle to deduce from the source code itself, can be expressed in a formal way that prefast understands. The effect is not only to reduce the noise, but also to intensify the analysis according to the specified behavior. For example, to correct a possible dereference of a NULL pointer inside a function, you either check for nullness first and exit on failure or, if the function’s contract is that the parameter is never NULL, add a simple annotation to the function signature. The annotation silences the warning inside the current function, but the analysis then starts checking the same condition at every call site of the function, something that was not checked before the annotation was set. Prefast warnings may be caused not only by code defects, but also by incorrect use of annotations. Annotations must always reflect the specification of the function, and must not be added only to silence prefast warnings.
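Here is a small sketch of annotated signatures. The three functions are hypothetical. The annotation names `_In_`, `_In_opt_` and `_Out_writes_` are the current spellings from `sal.h`; headers from the time of this post use older spellings such as `__in` and `__out_ecount`. The empty fallback definitions are an assumption made so that the sketch builds with compilers that have no `sal.h`, where the annotations serve as documentation only.

```cpp
#include <cstddef>

#if defined(_MSC_VER)
#include <sal.h>
#else
// Assumption: no-op fallbacks for compilers without SAL support.
#define _In_
#define _In_opt_
#define _Out_writes_(count)
#endif

// Contract: value is never null. No check is needed in the body; the
// obligation moves to the callers, where the analysis can verify it.
int read_value(_In_ const int* value) {
    return *value;
}

// Contract: value may be null, so the body has to check before using it.
int read_value_or(_In_opt_ const int* value, int fallback) {
    if (value == nullptr) {
        return fallback;
    }
    return *value;
}

// Contract: dest points to a writable buffer of count elements.
void fill_ints(_Out_writes_(count) int* dest, std::size_t count, int value) {
    for (std::size_t i = 0; i < count; ++i) {
        dest[i] = value;
    }
}
```

Changing `_In_opt_` to `_In_` on `read_value_or` merely to quiet a warning would misstate the contract, which is exactly the misuse warned against above.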
There are many categories of annotations: those applicable to function parameters, to buffers, and to code behavior and execution flow. They can describe very complex requirements, but I won’t enumerate them in this post. For a comprehensive list, see the MSDN SAL annotations reference mentioned at the end of this post. Our CRT/ATL/MFC headers are full of them…check them out. Once you understand what the SAL keywords describe, reading annotated code is quite easy.
SAL annotations can be applied to native code only. They help catch defects mostly in C code, but C++ benefits greatly too, even if its object-oriented nature offers a higher level of security and fewer holes. Even high-level abstractions like the STL have some annotations here and there.
Annotating large code bases may be painful for the developer, I admit, but the benefit is great. Consider that for the effort of reviewing and annotating each function signature (definition plus declaration), all calls to the function benefit automatically. Legacy code is still an important factor nowadays. Not many software companies can afford to rewrite all their products in safer high-level languages, although nothing prevents them from doing so for new features and products (since interop and compatibility have always been a VC++ focus). Hence, annotating old C/C++ sources (in combination with running static analysis tools like prefast) is a very practical way to increase security and remove defects.
I promise more in-depth information in a future vcblog entry. I will compile a useful list of specific prefast warning scenarios (from those I have recently encountered while cleaning up our libraries with prefast’s help), developer actions for fixing a defect or silencing a particular warning, SAL coding guidelines, prefast limitations and workarounds, and advice on integrating annotated with non-annotated code bases.
For more information, please also check out these resources: the Donn Terry Channel 9 video, Prefast for Drivers, Michael Howard’s introduction to SAL, and the MSDN SAL annotations reference.
SDE Visual C++ Libraries
Looking forward to future posts on this issue particularly those about specific warning scenarios you've encountered.
Thanks for the post, I *highly* value static code analysis for C++.
I had thought that prefast was for kernel-mode/driver development. How well does prefast work with user-mode and MFC?
I think SAL and Prefast are excellent ideas, but there are two reasons why I won't be using them for now:
1. In Windows CE most of the SAL macros aren't defined.
2. To use Prefast we have to buy VS2005 Team Server licences, which are considerably more expensive. Since Prefast is the only feature we need I can't justify the cost to my company.
Until Prefast is made available to all paying VS2005 users, or at a reasonable additional cost, its use will be restricted to large companies.
I am sorry for the confusion. Visual Studio Prefast (the one invoked by the /analyze switch) is a different (although similar) tool than PFD (Prefast for Drivers, belonging to Windows Device Driver Kit).
Visual Studio Prefast has all the checks except for the device-driver ones, but is still appropriate for user and kernel mode.
My only experience in using it is on the CRT, ATL and MFC libraries, and I can say so far that it is not perfect but it works very well.
Hello. I wanted to pop in and take a few moments to answer some of these questions.
'PREfast' as a name is in the DDK with rules focused on device drivers. However, it is also folded into the compiler as the /analyze command line switch. The /analyze supporting C++ compiler is offered in three places. It is in VS Team Dev, VS Team System, and the Windows Platform SDK. The integration to VS is a Team System effort; and we are investing significantly in it going forward. We wanted to make it available to a wider audience because of the security and platform support nature of many of the warnings, thus the desire to provide it for command line based builds in the WinSDK.
Also, in VS 2008, the SourceAnnotations.h files shipped in the CE and non-CE include directories are identical, and appropriate for annotating user code. It is true that the Windows CE annotated headers and their macro support have varied from version to version; but this is why the VC include directories have macro definitions for use.
It (VS Team edition prefast) would also be nice if it would work on all targets (e.g. X86, X64, Itanium) - currently it only works with X86 targets. Is this planned for a future release/Service pack?
Currently we have so few rules that need to be aware of the architecture that we have not made a big investment in the other targets. This is primarily because, to date, it seems most users will still target x86 even when they are also targeting the other architectures.
Internally we have played with the other targets and done runs over significant code bases. What we basically found is that we had enough issues with the port that we would need quite a bit of QA effort to make it ship-worthy; and yet it was not having significant value as the only real qualitative differences were in #ifdef'd code that got hit.
Now, that said; of course we are going that direction, at least for x64; I just don't know if we are talking 1 release away, or two.
[Nacsa Sándor, January 13 – February 3, 2009]  The subject of quality assurance is barely known