A few years back, Andy wrote a series of articles describing how to componentize a third-party application. In Part 5, he described how to use Dependency Walker to identify the first-order dependencies. But in Part 6, he revealed that the componentized application didn’t work at first because "something was missing in the runtime"; in other words, there was an undiscovered dependency. Andy resolved the issue by describing a different method for finding the missing dependency. This might have left a few of you wondering why an experienced componentization engineer like Andy didn’t use a single approach to discover all dependencies. To answer that question, we need to look a little deeper at dependencies.
Let’s start with a definition of a dependency. My definition goes something like this: a dependency is any entity (file, registry key, registry value, process, service, etc.) that must be present for an executable binary to function appropriately. Typically, appropriate behavior is defined as the same behavior as on XP Pro.
The first approach taken by Andy, in Part 5, was to use a "static analysis" tool to discover dependencies. This approach can find only load-time DLL dependencies, which the program loader must resolve to, well, load the code into memory. Static analysis is limited to finding this single type of dependency, but it is the most common type and is super easy to discover. It’s easy because it’s all in the "PE" header, which is well documented and understood*. Throw a rock in Redmond and you’re likely to hit a static dependency analysis tool. No, wait, I work in Redmond, make that a Nerf rock <grin>.
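To make the "it’s all in the PE header" point concrete, here is a minimal sketch of static analysis in Python. It is not how Dependency Walker is implemented; it is just my illustration of the idea. The function name `imported_dlls` is made up for this post, the field offsets follow the published PE format, and the code assumes a well-formed PE32 or PE32+ image. It reads only the standard import directory, so anything loaded another way is invisible to it.

```python
import struct

def imported_dlls(data):
    """Return the DLL names found in a PE image's import directory."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (pe,) = struct.unpack_from("<I", data, 0x3C)       # e_lfanew
    if data[pe:pe + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")

    (num_sections,) = struct.unpack_from("<H", data, pe + 6)
    (opt_size,) = struct.unpack_from("<H", data, pe + 20)
    opt = pe + 24                                       # optional header start
    (magic,) = struct.unpack_from("<H", data, opt)
    # Data directories start 96 bytes into a PE32 optional header and
    # 112 bytes into a PE32+ one; entry 1 is the import directory.
    dirs = opt + (96 if magic == 0x10B else 112)
    import_rva, _size = struct.unpack_from("<II", data, dirs + 8)
    if import_rva == 0:
        return []

    # Section headers (40 bytes each) map RVAs to file offsets.
    sections = []
    for i in range(num_sections):
        base = opt + opt_size + 40 * i
        vsize, vaddr, rsize, rptr = struct.unpack_from("<IIII", data, base + 8)
        sections.append((vaddr, max(vsize, rsize), rptr))

    def to_offset(rva):
        for vaddr, size, rptr in sections:
            if vaddr <= rva < vaddr + size:
                return rptr + (rva - vaddr)
        raise ValueError("RVA 0x%x is not inside any section" % rva)

    names = []
    pos = to_offset(import_rva)
    while True:
        # Each import descriptor is 20 bytes; the Name RVA is its 4th field.
        name_rva = struct.unpack_from("<IIIII", data, pos)[3]
        if name_rva == 0:           # a zeroed descriptor ends the table
            break
        start = to_offset(name_rva)
        end = data.index(b"\0", start)
        names.append(data[start:end].decode("ascii"))
        pos += 20
    return names
```

You would call it with the raw bytes of an executable, for example `imported_dlls(open("app.exe", "rb").read())`, where `app.exe` stands in for whatever binary you are componentizing. Notice that the program never has to run, which is exactly why this kind of tool is so easy to write and so limited in what it can find.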
To deal with the shortcomings of static analysis, Andy turned to using a "dynamic analysis" tool. This approach, if the tool is clever enough, can find all categories of dependencies. A dynamic analysis tool typically finds dependencies by monitoring API calls made while an application executes. Monitoring each and every API call would generate a vast amount of information, most of which wouldn’t be helpful, so only dependency-creating API calls are monitored. This still yields a large amount of data because some API functions are called frequently (e.g. RegOpenKey) and there are a lot of API functions which can create a dependency. However, the primary limitation of dynamic analysis is that if you don't happen to execute the code which creates the dependency, you don't discover the dependency.
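The coverage limitation is easier to see in a toy model. The sketch below is plain Python, not a real API monitor: real tools intercept Win32 calls, while this one just wraps a couple of stand-in functions. Everything in it (`DependencyTracer`, `open_key`, `open_file`, `toy_app`, and the paths) is hypothetical and invented for illustration.

```python
import functools

class DependencyTracer:
    """Records the resource named by each call to a watched function."""

    def __init__(self):
        # A set, so a function called a thousand times with the same
        # argument still yields a single dependency record.
        self.seen = set()

    def watch(self, kind):
        """Decorator: log (kind, first argument) whenever func is called."""
        def decorate(func):
            @functools.wraps(func)
            def wrapper(name, *args, **kwargs):
                self.seen.add((kind, name))
                return func(name, *args, **kwargs)
            return wrapper
        return decorate

tracer = DependencyTracer()

# Hypothetical stand-ins for dependency-creating API functions.
@tracer.watch("registry")
def open_key(path):
    return "handle:" + path

@tracer.watch("file")
def open_file(path):
    return "handle:" + path

def toy_app(export_report=False):
    """A pretend application with a rarely used feature."""
    open_key(r"HKLM\Software\ToyApp")
    if export_report:
        # Only this code path touches the extra file.
        open_file(r"C:\ToyApp\report.dll")
```

Run `toy_app()` and `tracer.seen` holds only the registry key. The file shows up only after a run with `export_report=True`. If your test pass never exercises that feature, the tracer never hears about the file, and neither do you.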
So, there you have it. You can use either static or dynamic dependency analysis. Both have significant limitations and neither can be expected to be perfect at discovering dependencies. The best one can do is to do as Andy did: use static analysis to get the low-hanging fruit, test, and if behavior doesn’t meet expectations, use dynamic analysis to discover what was missing.
*For more information on the PE header, see Part 1 and Part 2 of an article written by Matt Pietrek for the February & March 2002 issues of MSDN Magazine.
-Jim