This post is adapted from an internal mail. Some customers were confused about why their call stacks looked so different in Sampling mode and Instrumentation mode. Let's say your program consists of only two DLLs, foo.dll and bar.dll.

Foo.dll has two functions, Foo1 and Foo2.
Bar.dll has two functions, Bar1 and Bar2.

Imagine this call stack: Foo1 calls Bar1 calls Bar2 calls Foo2.

In sampling mode you would see, as expected:

Foo1
            Bar1
                        Bar2
                                    Foo2

Now let's say you look at the sampling stats and determine that your actual problem is in foo.dll since all the exclusive samples are in Foo2.
So you instrument foo.dll (and NOT bar.dll) to drill down on it.  The callstack would be:

Foo1
            Bar1
                        Foo2

This is a little counter-intuitive. The reason is that we add probe points at the entry and exit of each function AND around any call site whose target is external to the DLL.

So the bodies of Foo1 and Foo2 effectively become:

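void Foo1()
{
    FUNC_ENTER(Foo1);                  // probe: entering Foo1

    // do some stuff

    EXTERNAL_CALL_ENTER(Foo1, Bar1);   // probe: about to call out of foo.dll
    Bar1();                            // bar.dll has no probes of its own
    EXTERNAL_CALL_EXIT(Foo1, Bar1);    // probe: the external call has returned

    // do some more stuff

    FUNC_EXIT(Foo1);                   // probe: leaving Foo1
}

void Foo2()
{
    FUNC_ENTER(Foo2);                  // probe: entering Foo2
    // stuff
    FUNC_EXIT(Foo2);                   // probe: leaving Foo2
}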

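The profiler therefore sees this sequence of probes: enter Foo1, external call to Bar1, enter Foo2, exit Foo2, return from Bar1, exit Foo1. Nothing inside bar.dll fires a probe, so the call from Bar1 to Bar2 is invisible and Foo2 shows up directly under Bar1. That is the truncated call stack above, which lacks knowledge of Bar2.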
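To make the probe sequence concrete, here is a minimal sketch of how a call stack could be rebuilt from those probes alone. The ShadowStack class, its method names, and DeepestInstrumentedStack are hypothetical and exist only for this illustration; they are not the profiler's actual data structures. Plain method calls stand in for the FUNC_ENTER / FUNC_EXIT / EXTERNAL_CALL_ENTER / EXTERNAL_CALL_EXIT probes, and Bar1 and Bar2 carry no probes because bar.dll was not instrumented.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical shadow stack: every probe pushes or pops one frame.
class ShadowStack {
public:
    void FuncEnter(const std::string& name) { Push(name); }
    void FuncExit() { frames_.pop_back(); }
    void ExternalCallEnter(const std::string& callee) { Push(callee); }
    void ExternalCallExit() { frames_.pop_back(); }

    // Deepest stack observed so far, rendered as "A > B > C".
    const std::string& Deepest() const { return deepest_; }

private:
    void Push(const std::string& name) {
        frames_.push_back(name);
        if (frames_.size() > deepest_depth_) {
            deepest_depth_ = frames_.size();
            deepest_ = Join();
        }
    }

    std::string Join() const {
        std::string out;
        for (std::size_t i = 0; i < frames_.size(); ++i) {
            if (i != 0) out += " > ";
            out += frames_[i];
        }
        return out;
    }

    std::vector<std::string> frames_;
    std::size_t deepest_depth_ = 0;
    std::string deepest_;
};

ShadowStack g_stack;

void Foo2();

// bar.dll: not instrumented, so these functions fire no probes at all.
void Bar2() { Foo2(); }
void Bar1() { Bar2(); }

// foo.dll: instrumented at function entry/exit and around external calls.
void Foo2() {
    g_stack.FuncEnter("Foo2");
    g_stack.FuncExit();
}

void Foo1() {
    g_stack.FuncEnter("Foo1");
    g_stack.ExternalCallEnter("Bar1");  // only the call site is known, not Bar1's body
    Bar1();
    g_stack.ExternalCallExit();
    g_stack.FuncExit();
}

// Runs the example call chain and returns the deepest stack the probes revealed.
std::string DeepestInstrumentedStack() {
    g_stack = ShadowStack();
    Foo1();
    return g_stack.Deepest();
}
```

Calling DeepestInstrumentedStack() yields "Foo1 > Bar1 > Foo2": Bar1 appears only because Foo1's call site announced it, and Bar2 never appears because no probe ever fires inside bar.dll.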