Posts
  • PFE - Developer Notes for the Field

    What does High Performance Mean?

    • 0 Comments

    Over my years in PFE, and at Microsoft as a whole, I have worked on some REALLY cool websites and applications that are used around the world and drive a lot of business.  Getting to work on these cool projects is one of the main reasons that I love what I do: getting to see what cool problems people are solving with Microsoft technology.  One of those cases was highlighted a few months back on Channel9!

    http://channel9.msdn.com/shows/ARCast.TV/ARCastTV-Steve-Michelotti-of-eimagination-on-High-Performance-Web-Solutions/

    Steve Michelotti does a great job of explaining some of the stuff that we did and what the applications look like.  So check it out!

    Thanks,

    Zach

  • PFE - Developer Notes for the Field

    Passing a Managed Function Pointer to Unmanaged Code

    • 0 Comments

    I was working with a customer a while back and they had a situation where they wanted to be able to register a managed callback function with a native API.  This is not a big problem, and the sample that I extended did this already.  The question that arose was – how do I pass a managed object to the native API so that it can hand that object to the callback function?  Then I need to be able to determine the type of the managed object, because it is possible that any number of object types could get passed to this callback function.  I took the code sample that is provided on MSDN and extended it a bit to handle this.  Here is the original sample - http://msdn.microsoft.com/en-us/library/367eeye0.aspx 

    The first comment here is that this is some DANGEROUS territory.  It is really easy to corrupt things and get messed up.  For those that are used to managed code, this is very much C++ territory where you have all the power, and with that comes all of the rope that you need to hang yourself.

    As I mentioned above, the customer wanted to be able to pass different types of objects in the lpUserData so the callback can handle different types of data.  The way to do this is with a GCHandle.  Basically this gives you a native handle to the object.  You pass that around on the native side, and when it arrives on the managed side you can “trade it in” for the managed object.

    So let’s say you have some sort of managed class.  In this case I called it “callbackData” and you want to pass it off. 

    callbackData^ cd = gcnew callbackData();

    First you need to get a “pointer” to the object that can be passed off:

    GCHandle gch2 = GCHandle::Alloc(cd);   // root the object so the GC keeps it alive
    IntPtr ip2 = GCHandle::ToIntPtr(gch2); // opaque value representing the handle
    int obj = ip2.ToInt32();               // plain int the native side can carry around

    You need to keep the GCHandle around so you can free it up later.  It is very possible to leak GCHandles.  In addition, GCHandles are roots for CLR objects, so you are rooting any memory referenced by the object you have the GCHandle for, and the GC will not release any of that memory.  This, however, is not as bad as pinning the memory.  The GCHandle is a reference to the object that can be used later to retrieve the address of the actual object when needed.

    From the GCHandle you can convert to an IntPtr, and from the IntPtr down to an int that you can pass in the lpUserData parameter.  You could also just pass the IntPtr directly; I only wanted to demonstrate taking it all the way down to an int.  (Keep in mind that truncating the handle to 32 bits like this is only safe in a 32-bit process.)

    Then you can pass that value as part of an LPARAM or whatever you want to the native code. 

    int answer = TakesCallback(cb, 243, 257, obj);

    That value will get passed through to your callback, which uses your delegate to call into your managed function.  The native code might do something simple like this:

    typedef int (__stdcall *ANSWERCB)(int, int, int);
    static ANSWERCB cb;
     
    int TakesCallback(ANSWERCB fp, int n, int m, int ptr) {
       cb = fp;
       if (cb) {
          printf_s("[unmanaged] got callback address (%d), calling it...\n", cb);
          return cb(n, m, ptr);
       }
       printf_s("[unmanaged] unregistering callback");
       return 0;
    }

    Inside your callback function you simply get the object back:

    public delegate int GetTheAnswerDelegate(int, int, int);
     
    int GetNumber(int n, int m, int ptr) {
       Console::WriteLine("[managed] callback!");
       static int x = 0;
       ++x;
     
       IntPtr ip2(ptr);
       GCHandle val = GCHandle::FromIntPtr(ip2);
       System::Object ^obj = val.Target;
       Console::WriteLine("Type - " + obj->GetType()->ToString() );
     
       return n + m + x;
    }

    Once you have the object you can cast it or do whatever you need to move forward.  This approach allows you to bridge the managed and native worlds when you have a native function that makes a callback.  I will attach a complete sample for you to play with if you are interested, and let me know if there are any questions.  There are other ways to approach this, I am sure, but this seemed to work pretty well, and as long as you manage your GCHandles and callback references you should be good to go!
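
    To make the lifetime management concrete, here is a minimal standalone sketch of the allocate/convert/free pattern.  It is shown in C#, where the GCHandle API is identical; the direct call to GetNumber stands in for the native round trip, and the CallbackData class is just an illustrative placeholder:

    using System;
    using System.Runtime.InteropServices;

    class CallbackData { public string Name = "example"; }

    class Program
    {
        static int GetNumber(int n, int m, IntPtr userData)
        {
            // Trade the IntPtr back in for the managed object.
            GCHandle handle = GCHandle.FromIntPtr(userData);
            CallbackData data = (CallbackData)handle.Target;
            Console.WriteLine("Type - " + data.GetType());
            return n + m;
        }

        static void Main()
        {
            CallbackData cd = new CallbackData();
            GCHandle gch = GCHandle.Alloc(cd);      // roots cd until Free is called
            try
            {
                IntPtr userData = GCHandle.ToIntPtr(gch);
                GetNumber(243, 257, userData);      // stands in for the native callback round trip
            }
            finally
            {
                gch.Free();                         // forgetting this leaks the handle and roots cd forever
            }
        }
    }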

    Thanks,

    Zach


  • PFE - Developer Notes for the Field

    How to set a break point at the assembly instruction that returns an error code

    • 0 Comments

    By Linkai Yu 

    When debugging, there are situations where you don't have access to source code.  You might have debugging symbols for a program that neither crashes nor generates an exception, but simply displays an error message such as "The program encountered an error: 12055".  So you can't use the debugger to break in at the precise location of the error.  By the time the error message is displayed, all you can see on the call stack is the code related to displaying the error; the location of the instruction where the error was detected is gone from the call stack.  If you could set a breakpoint at the instruction that determines the error, you could look at the surrounding assembly instructions and guess what caused the problem.  The question, then, is how can we find the instruction that generates the error code?

    Here is what I do; it is not bulletproof, but it has a good success rate.

     The idea is to search the relevant DLL(s) for the error number, then verify each hit by unassembling the instructions backward from that location to see if the instructions make sense.  If you see the instruction pattern mov <target>, <value>, then most likely it's the right instruction.

     The following is an example of an error code returned from WININET.DLL: 12055.

    step 1. get the hex of 12055 by debugger command "?"

    0:003> ?0n12055
    Evaluate expression: 12055 = 00002f17

    step 2. find the start and end address of WININET.DLL

    0:003> lm m wininet
    start    end        module name
    46a70000 46b40000   WININET    (private pdb symbols) 

    step 3. search for hex 00002f17. since x86 stores values in little-endian byte order, the byte sequence to search for is: 17 2f 00 00

    0:003> s 46a70000 46b40000 17 2f 00 00
    46ac2623  17 2f 00 00 8b 8e d8 00-00 00 57 e8 07 3e ff ff  ./........W..>..
    46ac2870  17 2f 00 00 e9 27 5a ff-ff 8b 46 34 83 e8 07 0f  ./...'Z...F4....
    46ac4273  17 2f 00 00 0f 84 62 4c-fc ff 81 f9 7d 2f 00 00  ./....bL....}/..
    46afcbbb  17 2f 00 00 0f 84 46 01-00 00 48 48 0f 84 3e 01  ./....F...HH..>.
    46b0e9b8  17 2f 00 00 7d 00 00 00-30 f0 af 46 01 00 00 00  ./..}...0..F....

    this search turns up 5 results.

    step 4. unassemble backward from each location to verify that the instructions make sense.

    0:003> ub 46ac2623  +4
    WININET!ICSecureSocket::ReVerifyTrust+0x1f0 [d:\vista_gdr\inetcore\wininet\common\ssocket.cxx @ 1380]:
    46ac2600 2f              das
    46ac2601 0000            add     byte ptr [eax],al
    46ac2603 755a            jne     WININET!ICSecureSocket::ReVerifyTrust+0x292 (46ac265f)
    46ac2605 8b8d24fdffff    mov     ecx,dword ptr [ebp-2DCh]
    46ac260b 8b86d8000000    mov     eax,dword ptr [esi+0D8h]
    46ac2611 81c900000004    or      ecx,4000000h
    46ac2617 0988f8020000    or      dword ptr [eax+2F8h],ecx
    46ac261d c78528fdffff172f0000 mov dword ptr [ebp-2D8h],2F17h 

    0:003> ub 46ac2870  +4
    WININET!ICSecureSocket::VerifyTrust+0x4e9 [d:\vista_gdr\inetcore\wininet\common\ssocket.cxx @ 1674]:
    46ac2844 89bd24fdffff    mov     dword ptr [ebp-2DCh],edi
    46ac284a 899d1cfdffff    mov     dword ptr [ebp-2E4h],ebx
    46ac2850 399d1cfdffff    cmp     dword ptr [ebp-2E4h],ebx
    46ac2856 748f            je      WININET!ICSecureSocket::VerifyTrust+0x422 (46ac27e7)
    46ac2858 8b86d8000000    mov     eax,dword ptr [esi+0D8h]
    46ac285e 8b8d24fdffff    mov     ecx,dword ptr [ebp-2DCh]
    46ac2864 0988f8020000    or      dword ptr [eax+2F8h],ecx
    46ac286a c78528fdffff172f0000 mov dword ptr [ebp-2D8h],2F17h 

    of the five search results, two show valid instructions: 

       c78528fdffff172f0000 mov dword ptr [ebp-2D8h],2F17h 

     [ebp-2D8h] points to a local variable. The C++ equivalent code would look like:

        intError = 12055;

    now you can set breakpoints at addresses 46ac261d and 46ac286a and have fun debugging!

    bp 46ac261d; bp 46ac286a; g

    Thanks

    Linkai Yu

  • PFE - Developer Notes for the Field

    Problems with x86 Performance Monitor Counters on x64

    • 0 Comments

    This has come up a few times recently so I thought I would highlight a post from Todd Carter about how to get the x86 ASP.NET performance counters when running on an x64 OS:

    http://blogs.msdn.com/toddca/archive/2007/05/08/logging-32bit-asp-net-performance-counters-on-a-windows-2003-64bit-os.aspx

     

    Just thought I would call that posting out since it has been helpful.

  • PFE - Developer Notes for the Field

    IIS and Hyper-V

    • 0 Comments

    The other day the topic came up about best practices with IIS and Hyper-V.  One of our teammates pointed out the following links, which are an interesting read:

    http://www.virtualization.info/2008/05/microsoft-migrates-msdn-and-technet-on.html

    http://blogs.iis.net/mailant/archive/2008/05/29/technet-msdn-iis7-front-end-servers-running-100-virtualized-using-hyper-v.aspx

    These go over the migration of MSDN/TechNet infrastructure over to Hyper-V.  Cool stuff!

  • PFE - Developer Notes for the Field

    Windows Server 2003 Service Pack 2 and 24-bit Color TS Connections

    • 4 Comments

    Okay, this is a really weird one!  When you install Service Pack 2 on Windows Server 2003 you can no longer connect using MSTSC via Terminal Services (TS)/Remote Desktop (RDP) with a 24-bit color setting.  For the customer I was working with, this meant that the applications they were hosting on those servers did not look good at all over this downgraded connection.  For them it was a blocker to upgrading to SP2.  This has become more important recently because Service Pack 1 for Windows Server 2003 is about to fall out of support.

    So a hunt ensued for how to get this working.  It turns out that there was a bug in Service Pack 2 that disabled some of these higher color depths.  The bug was corrected in KB 942610, although it was not apparent that it applied to my problem since the description was different.  However, it turns out that the same underlying issue was causing both problems.  A couple of other changes needed to happen and then everything was working.  Here are the steps that I came up with to take a Win2K3 SP2 box and enable 24-bit color connections:

    1. Install the following hotfix - http://support.microsoft.com/kb/942610
    2. Update the following registry key to enable the fix:

      Registry subkey: HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Terminal Server
      Registry entry: AllowHigherColorDepth
      Type: REG_DWORD
      Value: 1

      Note that I found the hotfix does add this registry key, but it had the value set to 0, which disabled the fix.  I had to change it to 1 (see the reg.exe sketch after this list).  Also, typical warnings about editing the registry apply.  If you go crazy you can toast your box, so back up stuff first!  For those that are sticklers, here is the official warning:

      Important This section, method, or task contains steps that tell you how to modify the registry. However, serious problems might occur if you modify the registry incorrectly. Therefore, make sure that you follow these steps carefully. For added protection, back up the registry before you modify it. Then, you can restore the registry if a problem occurs. For more information about how to back up and restore the registry, click the following article number to view the article in the Microsoft Knowledge Base:

      322756 How to back up and restore the registry in Windows

    3. Ensure the TS box is accepting 24-bit connections
      1. Go to Start->Run
      2. Type “tscc.msc” and hit enter
      3. Click “Connections”
      4. Right click on “Rdp-tcp” and select properties
      5. Click the “Client Settings” tab
      6. Ensure that “Limit Maximum Color Depth” is set to 24-bit
    4. Restart the server to ensure everything is set.
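
    As mentioned in step 2, the registry change can also be scripted.  A minimal sketch using reg.exe, run from a command prompt (the key and value are exactly those listed above):

    reg add "HKLM\SYSTEM\CurrentControlSet\Control\Terminal Server" /v AllowHigherColorDepth /t REG_DWORD /d 1 /f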

    Hope this helps!

    Thanks,
    Zach

  • PFE - Developer Notes for the Field

    Best Practice - Workflow and Anonymous Delegates

    • 2 Comments

    Best Practice Recommendation

    In your Workflow application (more exactly, in the host of your workflow application) never use an anonymous delegate like this:

                      AutoResetEvent waitHandle = new AutoResetEvent(false);
                      .
                      .
                      .
    
                      workflowRuntime.WorkflowTerminated += delegate(object sender, WorkflowTerminatedEventArgs e)
                      {
                                waitHandle.Set();
                      };

    Details

    The code above is very common and works correctly, at least most of the time...  Actually, when you do something like this, after a few iterations (if you always keep the same instance of the workflow runtime, as should be the case) you will notice that the number of AutoResetEvent objects always increases, until eventually an Out Of Memory exception is raised.

    If you take a debugger like Windbg and display the chain of references for one particular instance of the AutoResetEvent class, you will have output similar to the following:

    DOMAIN(00000000002FF7E0):HANDLE(WeakSh):c1580:Root:  00000000029126e8(System.Workflow.Runtime.WorkflowRuntime)->
      000000000296e500(System.EventHandler`1[[System.Workflow.Runtime.WorkflowTerminatedEventArgs, System.Workflow.Runtime]])->
      0000000002966970(System.Object[])->  00000000029570e0(System.EventHandler`1[[System.Workflow.Runtime.WorkflowTerminatedEventArgs, System.Workflow.Runtime]])->
      0000000002957038(TestLeakWorkflow.Program+<>c__DisplayClass2)->  0000000002957050(System.Threading.AutoResetEvent)

    If you now dump the instance of TestLeakWorkflow.Program+<>c__DisplayClass2 you will have the following output:

    MT                    Field            Offset        Type               VT         Attr                 Value               Name
    000007fef70142d0      4000003          8             AutoResetEvent     0          instance             0000000002957050    waitHandle

    What does this tell us?  The class TestLeakWorkflow.Program+<>c__DisplayClass2 contains one field of type AutoResetEvent.  In our case this field holds a strong reference to the AutoResetEvent object; consequently the final object is still referenced and logically remains in memory.

    Why is that?  Well, the purpose of this entry is not to give details about anonymous delegates, but basically there is nothing magic about them.  For the delegate function to access variables which are not in its scope (look closer: the waitHandle object is not in the scope of the delegate function, but it can still access it), a mechanism is needed.  For that, the compiler creates an intermediate class with one field per object which is used in the delegate function but which is normally not in its scope; the delegate function is just a member function of this class, so it can easily access the fields of the class.

    What if you use hundreds of objects in the anonymous delegate, objects which are normally not in its scope?  Well, you will have an intermediate class with hundreds of fields.  The instances of the class will then reference the real objects...

    Why is that a problem?  Well, again, look closely: how can you get rid of this instance?  You have to remove the reference to the delegate, but you cannot (not with the syntax used above).  That means the intermediate class, and more problematically the final object, are still referenced and remain in memory.

    To better understand the problem, let’s now study the fragment below.  This is a quite classical fragment of code that you will often find on multiple blogs (I know, because it’s from one of those blogs that we got the code that was put into production, and on which I spent hours debugging to understand why the portal was dying after several hours):

            WorkflowInstance instance = workflowRuntime.CreateWorkflow(typeof(MyASPNetSequencialWorkFlow));
    
            instance.Start();
    
            workflowRuntime.WorkflowCompleted += delegate(object o, WorkflowCompletedEventArgs e1)
            {
                if (e1.WorkflowInstance.InstanceId == instance.InstanceId)
                { ...

    Yes, you got it: in this case it is the workflow instance which will be referenced and which will stay in memory.  Considering the fact that in a web application host (as well as in a Windows service host) the WorkflowRuntime is a long-lived object, you will for sure get an Out Of Memory exception.

    What to do then?  Well, code like the one below is definitely better:

                        EventHandler<WorkflowTerminatedEventArgs> terminatedHandler = null;
                        EventHandler<WorkflowCompletedEventArgs> completedHandler = null;
    
                        terminatedHandler = delegate(object sender, WorkflowTerminatedEventArgs e)
                        {
                            if (instance.InstanceId == e.WorkflowInstance.InstanceId)
                            {
                                Console.WriteLine(e.Exception.Message);
                                workflowRuntime.WorkflowCompleted -= completedHandler;
                                workflowRuntime.WorkflowTerminated -= terminatedHandler;
                                waitHandle.Set();
                            }
                        };
    
                        workflowRuntime.WorkflowTerminated += terminatedHandler;
    
                        completedHandler = delegate(object sender, WorkflowCompletedEventArgs e)
                        {
                            if (instance.InstanceId == e.WorkflowInstance.InstanceId)
                            {
                                WorkflowInstance b = instance;
                                workflowRuntime.WorkflowCompleted -= completedHandler;
                                workflowRuntime.WorkflowTerminated -= terminatedHandler;
                                waitHandle.Set();
                            }
                        };
    
                        workflowRuntime.WorkflowCompleted += completedHandler;

    Again you have to be prudent: you must remove all the handlers.  If in the end the WorkflowCompleted handler is executed, the WorkflowTerminated handler will never be executed; hence all the handlers are removed in both delegates.

    Thanks,
    Fabrice

  • PFE - Developer Notes for the Field

    Combining NetMon CAP Files

    • 1 Comments

    Today I was working on a problem that we thought was network related.  That meant it was time to try out some of those NetMon skills.  NetMon is the Microsoft network packet capture and analysis tool, similar to WireShark/Ethereal, and the current 3.2 version can be downloaded for free from here - http://www.microsoft.com/downloads/details.aspx?FamilyID=f4db40af-1e08-4a21-a26b-ec2f4dc4190d&DisplayLang=en

    For the longest time I, and a lot of my colleagues, used WireShark/Ethereal to do network captures.  However, NetMon has come a long way and I really am a big fan of it now.

    Back to my problem today.  We were dealing with a web server that has a decent amount of traffic and a problem that occurred intermittently.  I first started just capturing with the UI, but quickly saw the memory grow and the box slow down a bit.  I did not need to see the traffic going by; I just needed to do a capture.  This led me to the NMCap tool that ships with NetMon.  It is a basic command line tool that allows you to do captures.

    So I started capturing.  By default, if you use the *.chn extension on the file name you specify, it will create chained 20 MB CAP files with your network data.  This was great.  I could easily filter out the ones I did not want while I awaited the problem.
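
    A chained capture along those lines might look like the following sketch (the adapter selection and file name here are placeholders; adjust them to your environment):

    nmcap /network * /capture /file trace.chn:20M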

    Finally the problem reproduced and I was ready to analyze.  I opened the trace closest to when the problem occurred and began working through it.  I quickly found some conversations that I was interested in, but – where was the beginning?  Where was the end?

    The answer to this – In some of the other traces!

    Now I had 3, 4, 5, 6 different traces open and that was no good.  I wanted to be able to combine the 6 or 10 traces that I was interested in.  It turns out that the same NMCap tool can easily do this.  All you have to do is use the /InputCapture parameter.  You end up with a command line like this:

    nmcap /InputCapture Trace1.cap Trace2.cap Trace3.cap /capture /file Trace.cap:500M

    This was a life saver so I thought I would pass it along!  Enjoy and have a great weekend.

    Thanks,

    Zach

  • PFE - Developer Notes for the Field

    Dump Analysis in a Disconnected Environment

    • 1 Comments

    Over the years I have come across the following restrictions in a variety of environments:

    1. The machines where we could do dump analysis did not have access to the internet.
    2. The dump could not leave those machines.

    These types of restrictions are a common reason field team members are brought onsite, since the customer cannot ship data offsite.  The lack of internet access, though, raises the question - “How do I get the symbols I need?”

    The first option is to use the Symbol Packages that Microsoft provides here - http://www.microsoft.com/whdc/devtools/debugging/symbolpkg.mspx

    Now these symbol packages are a great starting point but they have a few key limitations:

    1. They are only for Windows, so if I need .NET symbols and such I will be out of luck.
    2. They are only built for major release milestones.  This means that if you have a system that is patched, you will find that at least some symbols do not match.  While this will not be a problem in all debug scenarios, it will be a problem in certain cases.

    For both of these situations the Microsoft public symbol server (http://msdl.microsoft.com/download/symbols) is the answer, because its symbols are updated more frequently and it also has symbols for more Microsoft products than just the OS.  The rub is - “How do I know what symbols to pull down if I cannot get the dump to the internet?”

    Enter SYMCHK - http://msdn.microsoft.com/en-us/library/cc267474.aspx

    This is a utility for validating symbols.  It has a “manifest” feature that we can utilize here.  The idea is to follow a series of steps like this:

    1. Generate a Manifest.
    2. Move that manifest to another machine.
    3. Create a Symbol Cache Using the Manifest.
    4. Move the Symbols to the Closed Machine.

    Let’s look at each of these steps in a bit more detail.

    Generate a Manifest

    To do this you can just run the following command:

    D:\debuggersx86>symchk /id D:\DumpFiles\Dump1.dmp /om symbols.txt

    SYMCHK lives in the directory where you installed the Debugging Tools for Windows.  You can replace the DMP file above with the location of the dump file that you wish to debug.  The “symbols.txt” file is the manifest that we are generating.

    The resultant manifest will contain a line for each module and look like this:

    kernel32.dll,46239bd5f5000,2
    ntdll.dll,411096b4b0000,2

    Moving the Manifest

    This is probably the trickiest part, depending on where you work.  The point here is that the file you have to move is simply a text file of module names.  You can do several things:

    1. Scrub out any custom modules so those do not leave.
    2. Write down the contents of the file if need be.

    The net of this is that you have a much more limited set of information to move off the system, as opposed to moving a dump, or even a minidump, which might be harder in various places.

    Creating a Symbol Cache Using the Manifest

    After you have the manifest on a machine with internet access you can then sync down the symbols using a command like this:

    symchk /im symbols.txt /s srv*d:\tempSym*http://msdl.microsoft.com/download/symbols

    You can use any directory in place of “d:\tempSym”. 

    Move the Symbols to the Closed Machine

    Once the symbols have come down, just copy the cache directory that you used in the previous step (d:\tempSym in the example) over to the closed machine.  Then, in the debugger, set your symbol path to include that folder.  For instance, if you copied the folder to d:\tempSym on the closed machine, you would add “srv*d:\tempSym” to your symbol path to start loading symbols from the folder.
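
    In WinDbg, that looks something like this (assuming the d:\tempSym location from the example):

    .sympath+ srv*d:\tempSym
    .reload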

  • PFE - Developer Notes for the Field

    .NET Framework 2.0 Service Pack 2

    • 1 Comments

    I have worked with several customers that want .NET Framework 2.0 SP2 but do not want to require their customers to install .NET Framework 3.5 with SP1, or do not want to install it themselves.

    Well, due to the way it was packaged, until now that was the only way to get it.  However, as of just the other week, they finally released a standalone installer for .NET Framework 2.0 Service Pack 2:

    http://www.microsoft.com/downloads/details.aspx?FamilyID=5b2c0358-915b-4eb5-9b1d-10e506da9d0f&displaylang=en

    This is great news and will make the deployment a bit easier for people on Windows XP and Windows Server 2003.  For Vista and Windows Server 2008 you still have to get .NET Framework 3.5 SP1 installed to get .NET Framework 2.0 SP2.

    Finally, make sure that you also install the following update - http://support.microsoft.com/kb/959209.  This update fixes some of the known issues that have been found with the service packs since their release.

    Have a great day and happy installing!

    Zach

  • PFE - Developer Notes for the Field

    Memory Based Recycling in IIS 6.0

    • 3 Comments

    Customers frequently ask questions regarding the recycling options for Application Pools in IIS.

    Several of those options are self explanatory, whereas others need a bit of analysis.

    I’m going to focus on the Memory Recycling options, which allow IIS to monitor worker processes and recycle them based on configured memory limits.

    Recycling application pools is not necessarily a sign of problems in the application being served. Memory fragmentation and other natural degradation cannot be avoided and recycling ensures that the applications are periodically cleaned up for enhanced resilience and performance.

    However, recycling can also be a way to work around issues that cannot be easily tackled. For instance, consider the scenario in which an application uses a third party component, which has a memory leak. In this case, the solution is to obtain a new version of the component with the problem resolved. If this is not an option, and the component can run for a long period of time without compromising the stability of the server, then memory based recycling can be a mitigating solution for the memory leak.

    Even when a solution to a problem is identified, but might take time to implement, memory-based recycling can provide a temporary workaround, until a permanent solution is in place.

    As mentioned before, memory fragmentation is another problem that can affect the stability of an application, causing out-of-memory errors when there seems to be enough memory to satisfy the allocation requests.

    In any case, setting memory limits on application pools can be an effective way to contain any unforeseen situation in which an application that “behaves well” goes haywire.  The advantage of setting memory limits on well-behaved applications is that in the unlikely case something goes wrong, a trace is left in the event log, providing a first place where research of the issue can start.

    When to configure Memory Recycling

    In most scenarios, recycling based on a schedule should be sufficient to “refresh” the worker processes at specific points in time.  Note that periodic recycling is the default, with a period of 29 hours (1740 minutes).  This can be an inconvenience, since each recycle occurs at a different time of day, eventually landing during peak times.

    If you have determined that you have to recycle your application pool based on memory threshold, it implies that you have established a baseline for your application and that you know your application’s memory usage patterns. This is a very important assumption, since in order to properly configure memory thresholds you need to understand how the application is using memory, and when it is appropriate to recycle the application based on that usage.

    Configuration Options

    There are two inclusive options that can be configured for application pool recycling:


    Figure 1 - Application Pool properties to configure Memory Recycling

    Maximum Virtual Memory

    This setting sets a threshold limit on the Virtual Memory. This is the memory (Virtual Address Space) that the application has used plus the memory it has reserved but not committed. To understand how the application uses this type of memory, you can monitor it by means of the Process – Virtual Bytes counter in Performance Monitor.

    For instance, if you receive out of memory errors, but less than 800MB are reported as consumed, it is often a sign of memory fragmentation.

    Maximum Used Memory

    This setting sets a threshold limit on the Used Memory. This is the application’s private memory, the non-shared portion of the application’s memory. You can use the Process – Private Bytes counter in Performance Monitor to understand how the application uses this memory.

    In the scenarios mentioned before, this setting would be used when you have detected a memory leak which you cannot avoid (or which is not cost-effective to correct).  This setting puts a “cap” on how much memory the application is “allowed” to leak before the application is restarted.

    Recycle Event Logging

    Depending on the configuration of the web server and application pools, every time a process is recycled, an event may be logged in the System log.

    By default, only certain recycle events are logged, depending on the cause for the recycle. Timed and Memory based recycling events are logged, whereas all other events are not.

    This setting is managed by the LogEventOnRecycle metabase property for application pools.  This property is a byte flag; each bit indicates a reason for a recycle.  Turning on a bit instructs IIS that the particular recycle event should be logged.  The following table shows the available values for the flag:

    Flag                          Value  Description

    AppPoolRecycleTime            1      The worker process is recycled after a specified elapsed time.
    AppPoolRecycleRequests        2      The worker process is recycled after a specified number of requests.
    AppPoolRecycleSchedule        4      The worker process is recycled at specified times.
    AppPoolRecycleMemory          8      The worker process is recycled once a specified amount of used or virtual memory, expressed in megabytes, is in use.
    AppPoolRecycleIsapiUnhealthy  16     The worker process is recycled if IIS finds that an ISAPI is unhealthy.
    AppPoolRecycleOnDemand        32     The worker process is recycled on demand by an administrator.
    AppPoolRecycleConfigChange    64     The worker process is recycled after configuration changes are made.
    AppPoolRecyclePrivateMemory   128    The worker process is recycled when private memory reaches a specified amount.

    As mentioned before, only some of the events are configured to be logged by default. This default value is 137 (1 + 8 + 128).

    It’s best to configure all the events to be logged. This will ensure that you understand the reasons why your application is being recycled. For information on how to configure the LogEventOnRecycle metabase property, see the support article at http://support.microsoft.com/kb/332088.
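
    For example, on IIS 6.0 a quick way to turn on all eight flags (1 + 2 + 4 + 8 + 16 + 32 + 64 + 128 = 255) for one pool is adsutil.vbs.  A sketch, assuming the default AdminScripts location and a pool named MyAppPool:

    cscript %systemdrive%\inetpub\adminscripts\adsutil.vbs SET W3SVC/AppPools/MyAppPool/LogEventOnRecycle 255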

    Below are the events logged because the memory limit thresholds have been reached:


    Figure 2 - The application pool reached the Used Memory Threshold


    Figure 3 - The application reached the Virtual Memory threshold

    Note that the article lists Event ID 1177 for Used (Private) Memory recycling, but in Event Viewer it shows up as Event ID 1117.

    Determining a threshold

    Unlike other settings, memory recycling is probably the setting that requires the most analysis, since the memory of a system may behave differently depending on various factors, such as the processor architecture, running applications, usage patterns, /3GB switch, implementation of Web Gardens, etc.

    In order to maximize service uptime, IIS 6.0 does Overlapped recycling by default. This means that when a worker process is due for a recycle, a new process is spawned and only when this new process is ready to start processing requests does the recycle of the old process actually occur. With overlapped mode, there will be two processes running at one point in time. This is one of the reasons why it is very important to understand the memory usage patterns of the application. If the application has a large memory footprint at startup, having two processes running concurrently could starve the system’s memory. For example, in a 32 bit Windows 2003 system with 4GB memory, Non-Paged Pool should remain over 285MB, Paged Pool over 330MB~360MB and System Free PTE should remain above 10,000. Additionally, Available Memory should not be lower than 50MB (these are approximate values and may be different for each system[1]). In a case in which recycling an application based on memory would cause these thresholds to be surpassed, Non-Overlapped Recycling Mode could help mitigate the situation, although it would impact the uptime of the application. In this type of recycling, the worker process is terminated first before spawning the new worker process.

    In general, if the application uses X MB of memory, and it’s configured to recycle when it reaches 50% over the normal consumption (1.5 * X MB), you will want to ensure that the system is able to support 2.5 * X MB during the recycle without suffering from system memory starvation. Consider also the need to determine what type of memory recycling option is needed. Applications that use large amounts of memory to store application data or allocate and de-allocate memory frequently might benefit from having Maximum Virtual Memory caps, whereas applications that have heavy memory requirements (e.g. a large application level cache), or suffer from memory leaks could benefit from Maximum Memory Used caps.

    The documentation suggests setting the Virtual Memory threshold as high as 70% of the system’s memory, and the Used Memory as high as 60% of the system’s memory. However, for recycling purposes, and considering that during the recycle two processes must run concurrently, these settings could prove to be a bit aggressive. As an estimate, for a 32 bit server with 4 GB of RAM, the Virtual Memory should be set to some value between 1.2 GB and 1.5 GB, whereas the Private bytes should be around 0.8 GB to 1 GB. These numbers assume that the application is the only one in the system. Of course, these numbers are quick rules of thumb and do not apply to every case. Different applications have very different memory usage patterns.

    For a more accurate estimate, you should monitor your application for a long enough period to capture information about memory usage (private and virtual bytes) during the most common scenarios and stress levels.  For example, if your application is used consistently on an everyday basis, a couple of weeks’ worth of data should be enough.  If your application has a monthly process, and each week of the month has different usage (users enter information at the beginning of the month and heavy reporting activity occurs at the end of the month), then a longer period may be appropriate.  In addition, it is important to remember that data may be skewed if you monitor the application during the holidays (when traffic may be only 10% of what is expected), or during the release of that long-awaited product (when traffic may go as high as 500% of typical usage).

    Problems associated with recycling

    Although recycling is a mechanism that may enhance application stability and reliability, it comes with a price tag. Too little recycling can cause problems, but too much of a good thing is not good either.

    Non-overlapped recycles

    As discussed above, there are times when overlapped recycling may cause problems. Besides the memory conditions described above, if the application creates or instantiates objects for which only one instance can exist at a particular time for the whole system (like a named kernel object), or sets exclusive locks on files, then overlapped recycle may not be an option. It is very important to understand that non-overlapped recycles may cause users to see error messages during the recycling process. In situations like this, it is very important to limit recycles to a minimum.

    Moreover, if the time it takes for a process to terminate is too long, it may cause noticeably long outages of the application while it is recycling. In this case, it is very important to set an appropriate application pool shutdown timeout. This timeout defines the length of the period of time that IIS will wait for a worker process to terminate normally before it forces it to terminate. This is configured using the ShutdownTimeLimit metabase property (http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/1652e79e-21f9-4e89-bc4b-c13f894a0cfe.mspx?mfr=true).

    To configure an application pool to use non-overlapped recycles, set the application pool’s metabase property DisallowOverlappingRotation to true. For more information on this property, see the metabase property reference at http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/1652e79e-21f9-4e89-bc4b-c13f894a0cfe.mspx?mfr=true.
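
    For example, with adsutil.vbs both of these metabase properties might be set like this (a sketch; the pool name and the 90-second timeout are placeholders):

    cscript %systemdrive%\inetpub\adminscripts\adsutil.vbs SET W3SVC/AppPools/MyAppPool/ShutdownTimeLimit 90
    cscript %systemdrive%\inetpub\adminscripts\adsutil.vbs SET W3SVC/AppPools/MyAppPool/DisallowOverlappingRotation TRUE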

    Session state

    Stateful applications can be implemented using a variety of choices of where to store the session information.  For ASP.NET, some of these are cookie-based, in-process, Session State Server, or SQL Server.  For classic ASP the list is more limited.  These implementation decisions have an impact on whether to use web gardens and how to implement web farms.  Additionally, they may determine the effect that application pool recycling has on the application.

    In the case of in-memory sessions, the session information is kept in the worker process’ memory space. When a recycle is requested, a new worker process is started and session information on the recycled process is lost. This effectively affects any active session that exists in that application. This is yet another reason why the amount of recycles should be minimized.

    Application Startup

    Recycling an application is an expensive task for the server. Process creation and termination are required to start the new worker process and bring down the old one. In addition, any application level cache and other data in memory are lost and need to be reloaded. Depending on the application, these activities can significantly reduce the performance of the application, providing a degraded user experience.

    Incidentally, another post in this blog discusses another of the reasons application startup may be slow. In many cases this information will help speed up the startup of your application. If you haven’t done it yet, check it out at http://blogs.msdn.com/pfedev/archive/2008/11/26/best-practice-generatepublisherevidence-in-aspnet-config.aspx.

    Memory Usage checks

    When an application is configured to recycle based on memory, it is up to the worker process to perform the checks. These checks are done every minute. If the application has surpassed the memory caps at the time of the check, then a recycle is requested.

    Bearing this behavior in mind, consider a case in which an application behaves well most of the times, but under certain conditions, it behaves really badly. By misbehavior I mean very aggressive memory consumption during a very short period of time. In this case, the application can cause other problems by exhausting the system’s memory before the worker process checks for memory consumption and any recycle can kick in. This could lead to problems that are difficult to troubleshoot and limit the effectiveness of memory based recycling.

    Conclusion

    The following points summarize good practices when using memory based recycling:

    • Configure IIS so it logs all recycle events in the System Log.
    • Use memory recycling only if you understand the application’s memory usage patterns.
    • Use memory based recycling as a temporary workaround until permanent fixes can be found.
    • Remember that memory based recycling may not work as expected when the application consumes very large amounts of memory in a very short period of time (less than a minute).
    • Closely monitor recycles and ensure that they are not too many (reducing the performance and/or availability of the application), or too few (allowing memory consumption of the application to create too much memory pressure on the server).
    • Use overlapped recycling when possible.

    Thanks,

    Santiago

    Santiago Canepa is an Argentinean relocatee working as a Development PFE.  He started out writing custom software in Clipper and worked for the Argentinean IRS (DGI) while earning his BS in Information Systems Engineering.  He worked as a consultant on Microsoft technologies both in Argentina and the US, and led a development team before joining Microsoft.  He has a passion for languages (don’t engage in a discussion about English or Spanish syntax – you’ve been warned), loves a wide variety of music and enjoys spending time with his husband and daughter.  He can usually be found online at ungodly hours in front of his computer(s).


    [1] Vital Signs for PFE – Microsoft Services University

  • PFE - Developer Notes for the Field

    Best Practice – WCF and Exceptions

    • 1 Comments

    Best Practice Recommendation

    In your WCF service, never let an exception propagate outside the service boundary without managing it.  There are two alternatives:

    • Either manage the exception inside the service boundary and never propagate it outside, or
    • Convert the typed .NET exception into a FaultException in your exception manager before propagating it.

    Details

    Most developers know how to handle exceptions in .NET code, and a good strategy can consist of letting an exception bubble up if it cannot be handled correctly; ultimately the exception will be managed by one of the .NET Framework default handlers, which, depending on the situation, may close the application.

    WCF manages things slightly differently.  If the developer lets an exception bubble up to the WCF default handler, WCF converts this typed .NET exception (which has no representation in the SOA world) into a FaultException (which is represented in the SOA world as a SOAP fault, http://www.w3.org/TR/soap12-part1/ ) and then propagates this exception to the client.  Up to this point you might think everything is OK, and you would be partly right: it is legitimate that the client application receives the exception and manages it.  However, there are two problems here:

    • One is minor: by default you don’t control the conversion, and you may want to send a custom FaultException to the client application, something not too generic.
    • The WCF default handler also faults the channel, and this is far more problematic, because if the client reuses the proxy (a quite common approach after managing the exception), the next call will fail.  It’s quite easy in .NET to work around this issue just by testing whether the proxy is faulted in the exception manager, but not every language offers these facilities, and remember that SOA is by essence very open.

    For those of you who want more information, this blog post is very good: http://weblogs.asp.net/pglavich/archive/2008/10/16/wcf-ierrorhandler-and-propagating-faults.aspx .  From a general point of view, implementing IErrorHandler is a popular approach to this issue (see the sketch below), so from your favorite search engine (www.live.com) just search for IErrorHandler.
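
    To give a rough idea of that approach, here is a minimal IErrorHandler sketch.  The generic fault message is a placeholder, and wiring the handler onto each ChannelDispatcher via a custom IServiceBehavior is left to the linked post:

    using System;
    using System.ServiceModel;
    using System.ServiceModel.Channels;
    using System.ServiceModel.Dispatcher;

    public class FaultConverter : IErrorHandler
    {
        public void ProvideFault(Exception error, MessageVersion version, ref Message fault)
        {
            if (error is FaultException)
                return; // already a SOAP fault; let it flow through untouched

            // Convert the .NET exception into a generic FaultException so the
            // channel does not fault and no internal details leak to the client.
            var faultException = new FaultException("The service encountered an error.");
            MessageFault messageFault = faultException.CreateMessageFault();
            fault = Message.CreateMessage(version, messageFault, faultException.Action);
        }

        public bool HandleError(Exception error)
        {
            // Log the original exception here; returning true marks it as handled.
            return true;
        }
    }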

    Contributed by Fabrice Aubert

  • PFE - Developer Notes for the Field

    Windows Error Reporting (WER) Web Service

    • 0 Comments

    I have long thought that one of the coolest features of the Windows OS for developers is the Windows Error Reporting infrastructure (with roots in Dr. Watson).  You can have your application upload dumps to the WinQual portal – http://winqual.microsoft.com.

    One of the problems, though, has always been getting those dumps off of the WER site.  Before, it was a manual process of going to the WinQual website and downloading CAB files.  Well, I found out today that the WER team recently released a web service that allows you to pull them down automatically.  They have also released some sample applications to get you started:

    Windows Error Reporting on CodePlex - http://www.codeplex.com/wer

    This opens up all sorts of possibilities such as automatically creating bugs in your bug DB (TFS hopefully!) in response to a new dump file getting uploaded so that someone on your team can take a look.

    Finally the WER team has a blog up and running (http://blogs.msdn.com/wer) and their first entry discusses this and has some demo videos from PDC:

    http://blogs.msdn.com/wer/archive/2008/11/25/windows-error-reporting-wer-blog.aspx

    Enjoy!

    Zach

  • PFE - Developer Notes for the Field

    Best Practice - <GeneratePublisherEvidence> in ASPNET.CONFIG

    • 2 Comments

    Best Practice Recommendation

    Add the following line to your ASPNET.CONFIG or APP.CONFIG file:

    <?xml version="1.0" encoding="utf-8"?>
    <configuration>
        <runtime>
            <generatePublisherEvidence enabled="false"/>
        </runtime>
    </configuration>

    Note the ASPNET.CONFIG file is located in the Framework directory for the version of the Framework you are using.  For example, for a 64-bit ASP.NET application it would be:

    c:\Windows\Microsoft.NET\Framework64\v2.0.50727

    For a 32-bit application it would be:

    c:\Windows\Microsoft.NET\Framework\v2.0.50727

    Details

    I have seen this a bunch of times while onsite.  The problem goes something like this:

    When I restart my ASP.NET application the initial page load is very slow.  Sometimes upwards of 30+ seconds. 

    Many people just blame this on “.NET startup” costs, but there is no out-of-the-box reason that an ASP.NET application should take that long to load.  Some applications do work on startup which can cause a startup slowdown, but there are other things that cause slowdowns as well.  A common cause that I have seen often recently is Certificate Revocation List (CRL) checking when generating publisher evidence for Code Access Security (CAS).

    A little background – CAS is a feature in .NET that allows you to have more granular control over what code can execute in your process.  Basically there are 3 parts:

    1. Evidence – Information that a module/code presents to the runtime.  This can be where the module was loaded from, the hash of the binary, the strong name, and, importantly for this case, the Authenticode signature that identifies a module’s publisher.
    2. Permission Set – A group of permissions to give code (access to the file system, access to AD, access to the registry).
    3. Code Group – The evidence is used to determine membership in a code group.  Permission sets are granted to a code group.

    So when a module loads, it presents a bunch of evidence to the CLR and the CLR validates it.  One type of evidence is the “publisher” of the module.  This evidence is validated by looking at the Authenticode signature, which involves a certificate.  When validating the certificate, the OS walks the chain of certificates and tries to download the Certificate Revocation List from a server on the internet.  This is where the slowdown occurs.

    A lot of servers do not have access to make calls out to the internet.  It is either explicitly blocked, the server might be on a secure network, or a proxy server might require credentials to gain access.  If the DNS/network returns a failure quickly, the OS check moves on; but if the DNS/network is slow, or does not respond to the request at all, we have to wait for a timeout. 

    This can occur for multiple modules, because this evidence is created for each module that is loaded.  Once we have looked for a particular CRL and failed, we will not recheck it; however, different certificates have different CRLs, so multiple timeouts can still stack up.  For instance, a VeriSign certificate may have one CRL URL while a Microsoft certificate has a different one.

    Since this probe can slow things down, it is best to just avoid it if you do not need it.  For .NET, the only reason you would need it is if you are setting Code Access Security policy based on the module’s publisher.  Because this can cause potential slowdowns, and you do not need to incur this penalty, you can just disable the generation of publisher evidence when your modules are loaded.  To do so, use the <generatePublisherEvidence> application configuration element.  Just set the enabled attribute to false and you will avoid all of this.

    Now, for ASP.NET applications it was not immediately obvious how to do this: it turns out that you cannot add this to an application’s Web.Config, but you can add it to the ASPNET.CONFIG file in the Framework directory.  For other applications, just add the element to the APP.CONFIG file.

    In closing, there are several blog entries that do a great job of demonstrating how this will show up in the debugger, along with other details on CRL issues and workarounds.

    We are highlighting this as the first in a series of general best practice recommendations.

    NOTE – If you have .NET 2.0 RTM you will need this hotfix - http://support.microsoft.com/default.aspx/kb/936707

  • PFE - Developer Notes for the Field

    A Few TFS 2008 Pre & Post Installation things you need to know

    • 1 Comments

    The PFE Dev team would like to welcome Micheal Learned to the blog.  Here is a bit about Mike:

    My name is Micheal Learned, and I’ve been working for Microsoft for over a year now with some of our Premier customers across the US, helping them support a variety of .NET related systems and tools.  I’m doing a lot of work currently with VSTS/TFS 2008, helping customers get up and running with new installations, migrations, and general knowledge sharing around the product.  I also enjoy working with web applications, IIS, and just researching various new Microsoft technologies in general.  In my free time I enjoy sports (I watch now; I used to like to play), and spending time with my son Nicholas (7) and daughter Katelyn (3), generally just letting them wear me down playing various kids’ stuff.

    Enough about me, here is the good stuff:

    I’ve been spending a lot of time recently working with customers on TFS and VSTS 2008. Some of my customers have required some support around getting a clean install, or troubleshooting some post-install issues, so I thought I would post a few of the common issues I’ve been seeing.

    First of all, for any pre-install or post-install issues, I highly recommend you run the TFS Best Practices Analyzer, which is available as a “power tool” download here.

    The analyzer can run a system test, and generate a report indicating the health of your existing system, or a report identifying possible issues with your environment configuration ahead of a TFS installation (pre-install). The information is useful, and generates links to help files that can help you solve various issues. It helped me through a range of issues while troubleshooting an installation recently, including IIS permissions, registry information, etc. The idea is that more and more data collection and reporting points are being added to this tool, so you should keep an eye out for future releases of the tool as well.

    Before doing any new installation it is imperative you download the latest version of the TFS 2008 installation guide available here. I can’t overstate the importance of following the guide step by step with any new installation, and although this version of TFS is a smoother install than the previous version (2005), it is still important that you pay close attention to setting up the proper accounts and their respective network permissions. It may seem a little cumbersome, but you’ll want the TFS services and various accounts running with the least sufficient permissions as a security best practice, and it is well documented in the guide.

    Enough about general stuff, let’s discuss some specific points:

    “Red X of Death”

    If, after installation, your client machines are seeing a red “X” on either their documents or reports folders in Team Explorer, they are encountering a permissions issue on SharePoint (if the documents folder) or the Reporting Services layer (if the reports folder). I’ve seen this to be quite common, and it should just be pointed out that you should configure proper permissions for your Active Directory users at each of those tiers. Visual Studio Magazine Online has a nice article documenting this here http://visualstudiomagazine.com/columns/article.aspx?editorialsid=2742 .

    “Firewall Ports”

    If you are running TFS, and have a firewall in play, you will need specific ports for the various TFS components to communicate properly. The ports are well documented in the TFS guide under the “Security Issues for Team Foundation Server” section, and there is some automation with new installations that will configure your windows firewall ports for you during install. If you need to configure them post-install, it is just a matter of setting up the firewall exceptions manually.

    “TFS Security”

    TFS security deserves its own blog entry, but I want to mention a few quick items, since TFS security can be a common stumbling block initially. The TFSSetup account (as described in the TFS guide) is by default an administrator on the TFS server. You’ll need to configure permissions on several tiers of TFS for your other users and groups: specifically, configure permissions at the overall TFS server level, the TFS project level, the Reporting Services level, and the SharePoint level. It may seem cumbersome at first, but just right click your server node in Team Explorer and it becomes a point-and-click exercise; right click the project node to configure project-level permissions. As a best practice for manageability you will want to simply use Active Directory groups to drive this, and if you have a need you can get very granular when setting up permissions, new roles, etc. Also there is a power toy available that gives you a singular GUI view for managing permissions all in one place, which you can download here http://www.codeplex.com/Wiki/View.aspx?ProjectName=TFSAdmin .

    “Just remember TFS is multi-tier”

    Finally, I would just point out to keep in mind that TFS 2008 is a distributed system with multiple tiers. Open IIS or SQL Server to poke around and look at the databases, directories, permissions, etc., to familiarize yourself with the architecture (do not directly edit the database – look, don’t touch!).

    Many customers of course are often “viewing” their TFS server from Visual Studio, and if you see issues connecting to a TFS project or a TFS server, or have issues with the various views on top of reports or SharePoint documents, you should keep in mind that the underlying reason probably lies in an IIS site being down, SharePoint configuration, an improper permission, a firewall port, etc. In a nutshell, focus on permissions and configuration settings at each tier, per the TFS install guide, and your issues are likely to be solved!

    Mike Learned --Premier Field Engineer .NET

  • PFE - Developer Notes for the Field

    Why is my Window Grayed Out?

    • 1 Comments

    Starting back in Windows 2000, if I recall correctly, the OS added a feature to try and handle unresponsive windows.  The idea is that if an application hangs, we do not want the window associated with that application to block other stuff the user is trying to do.  This feature has been improved through Windows XP and now into Vista.  The feature is now pretty good.

    When the OS detects that the application is not “responding” to messages, it will take control of the window: the window is replaced with a grayed-out “ghost” copy that the user can still move, minimize, or close.

    I came across this when working with a customer that has a client side application.  It is single threaded for the most part and has a work flow like this:

    • Start long running Operation
    • Prevent painting main UI and ignore most input to that region
    • Update Status Pane via periodic calls to pump messages for that pane
    • Update full UI after the operation is complete

    This is not an uncommon scenario, especially for applications that have been around for a while.  Sometimes it can be very hard to allow the window to paint during a long running operation, and while allowing painting is the real answer, there are times when this is MUCH easier said than done.

    The feature that grays out windows seems to have gotten more aggressive in Windows Vista, to better detect various hang scenarios, which is why this became an issue for them.  Basically it was becoming more common to see the graying or “ghosting” of the application’s window.  This led to users thinking the application was dead (the status bar also gets grayed out) and terminating it.  In XP the application would stay put, and while the main part of the UI would not respond you could still watch the status bar move.

    The question then became: what do we do about this?  The obvious answer is to make the application responsive during the long running operation.  While that is the goal, there was no short term way to get there.  We tried a bit to pump more messages but ran into a sticky issue - how do I “appear” responsive without actually being responsive?  We were basically trying to guess what the OS was looking for, and since that can change, as we have seen, this is not reliable.

    After some digging I found an interesting API:

    DisableProcessWindowGhosting
    http://msdn.microsoft.com/en-us/library/ms648415.aspx

    Basically this API tells the OS to skip the ghosting even if it detects the application as unresponsive.  Once called, it stays in effect for the rest of the application's run.
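    There is no managed wrapper for this API, so if you need to call it from .NET code, a minimal P/Invoke sketch looks like the following (the wrapper class and method names are illustrative):

    using System.Runtime.InteropServices;

    internal static class WindowGhosting
    {
        // DisableProcessWindowGhosting lives in user32.dll, takes no
        // parameters, and has no return value.
        [DllImport("user32.dll")]
        private static extern void DisableProcessWindowGhosting();

        // Call once early in startup; it stays in effect for the life of
        // the process and cannot be undone.
        internal static void DisableForThisProcess()
        {
            DisableProcessWindowGhosting();
        }
    }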

    The real answer is to fix the application to be responsive, as mentioned, but that is not always easy, so this is a stop gap.

    A side note: when you debug an application, the graying or ghosting of the window will not occur.  So if you are trying to reproduce the behavior, make sure that you are running outside of the debugger (this got me when trying to reproduce!).

    Hope this helps and have a great weekend!

    Zach

  • PFE - Developer Notes for the Field

    SN.EXE and Empty Temp Files in %TEMP%

    • 0 Comments

    If you have a build server and are doing delay signing, this is probably of interest to you.  When delay signing, the final step is to run the following command as a post-build step:

    sn -R myAssembly.dll sgKey.snk

    I have seen build setups that output all binaries to one folder and then run a loop across all the DLLs executing this command.  That way everything is fixed up and ready for the devs to run, and you do not have to worry about adding a post build step to each project (or adding one common script to each project).  The release builds of course get fully signed, but for daily/rolling builds this works just fine.  Until you check out your temp directory. 
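    A minimal sketch of that kind of post-build loop, written for a batch file (the drop folder and key file names are placeholders; use a single % instead of %% if you type it directly at a command prompt):

    for %%f in (C:\drops\latest\*.dll) do sn -R "%%f" sgKey.snk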

    It turns out for each call to SN.EXE a TMP file is generated.  These TMP files look like this:

    C:\temp>dir
    Volume in drive C has no label.
    Volume Serial Number is 1450-ABCF

    Directory of C:\temp

    12/11/2007  12:39 PM    <DIR>          .
    12/11/2007  12:39 PM    <DIR>          ..
    12/11/2007  11:17 AM                 0 Tmp917.tmp
    12/11/2007  11:17 AM                 0 Tmp91C.tmp
    12/11/2007  11:17 AM                 0 Tmp921.tmp

    You probably noticed that these are 0 byte files, which means disk space is not an issue, but if you have a ton of them directory operations can slow your hard drive down.  Also, the actual SN.EXE process can start slowing down. 

    If you have 100 DLLs and are doing 30 builds a day, that is 3,000 temp files.  Now let’s say you build debug x86, debug x64, release x86, and release x64 - that is now up to 12,000 a day.  After a couple of days you can imagine how many files pile up. 

    I was working on this and we found out that SN.EXE is not actually the problem here (thanks to Scot Brennecke for that).  The issue is in the Crypto APIs that SN.EXE uses; these APIs are the ones that create, and do not clean up, the TMP files.  It turns out that there is a hotfix for this if you are using Windows Server 2003:

    On a Windows Server 2003-based client computer, the system does not delete a temporary file that is created when an application calls the "CryptQueryObject" function
    http://support.microsoft.com/kb/931908

    After applying this hotfix the temp files are no longer created and life was happy.  I just thought I would share because the connection between this hotfix and the problem was not immediately obvious.

    Have a great day!

    Zach

  • PFE - Developer Notes for the Field

    Debugging Large ViewState

    • 1 Comments

    This week I have been working with a customer that had pretty large ViewStates that were getting pushed up and down between the client and the server.  The application was moving about 200+ KB of ViewState between the client and server.  This is something we picked out quickly as a scale issue.

    Now Tess has a great blog on why this is a problem from a memory perspective and provides some background on the issue:

    ASP.NET Memory – Identifying pages with high ViewState
    http://blogs.msdn.com/tess/archive/2008/09/09/asp-net-memory-identifying-pages-with-high-viewstate.aspx

    In this customer’s case we were seeing multi-second times for the page to travel between server and client, so this was also having an impact on performance even before it became a memory problem.  They would also run into the memory issues once load ramped up.

    The customer posed the question - “Well how do I know what is in my ViewState and what is causing the bloat?”

    This is a fair question.  We referred to Tess’s blog but hit some snags that I wanted to comment on in case other people hit them (read her blog first for background if you have questions).

    Where is the ViewState? 

    To get started we downloaded ViewState Decoder (2.2), but we could not point it directly at the page due to permissions issues, so we went to pull the ViewState from the page's “View Source” in IE.  A problem arose: as we navigated the page, the IIS logs showed the response size increasing - the SC-BYTES (Bytes Sent) value grew to almost double the initial page request - yet the saved “View Source” output remained identical even though a number of postbacks had occurred.  Something was not adding up.

    This application has one page with multiple tabs, and as we moved through the tabs the page grew, but as we said, that growth was not reflected in “View Source.”  It turns out, after talking with the customer, that they are using the UpdatePanel from ASP.NET AJAX, and each of the postbacks is a partial update to that panel.  These updates are not reflected in “View Source.”  To capture this easily we returned to an old favorite - Fiddler from http://www.fiddlertool.com/fiddler/

    From there we were able to get the ViewState out.  Because this was being sent back to AJAX the ViewState showed up like this:

    |hiddenField|__VIEWSTATE|/wE
    <CLIP>
    EA==|9168|hiddenField|__VIEWSTATEENCRYPTED  

    You will notice that these are not normal tags; the ViewState is carried between the “|” delimiters.  We grabbed this bit and dropped it into the ViewState Decoder, but got an error that the ViewState was invalid:

    There was an error decoding the ViewState string: The serialized data is invalid

    Why is the ViewState string invalid if the page worked?

    We puzzled over this for a bit and then I noticed what you may already have noticed in the list of hidden fields:

    __VIEWSTATEENCRYPTED

    It turns out that this indicates that the ViewState is encrypted.  So it was understandable that the tool could not decode my cut and paste of the ViewState.  We decided to turn off ViewState encryption for this troubleshooting since the application was just in test; basically, we added ViewStateEncryptionMode="Never" to the @ Page directive for the test.  Then BINGO!  We were able to decode the ViewState and get some rich information.
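    For reference, the test change is a single attribute on the @ Page directive; the other attributes here are placeholders for whatever the page already declares:

    <%@ Page Language="C#" CodeFile="Orders.aspx.cs" Inherits="Orders" ViewStateEncryptionMode="Never" %>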

    Here are some links on encryption:

    @ Page Directive
    http://msdn.microsoft.com/en-us/library/ydy4x04a.aspx

    ViewStateEncryptionMode
    http://msdn.microsoft.com/en-us/library/system.web.ui.viewstateencryptionmode.aspx

    For ViewStateEncryptionMode the default is Auto, which means that if any of your controls requests encryption using the following API, the ViewState will get encrypted:

    RegisterRequiresViewStateEncryption
    http://msdn.microsoft.com/en-us/library/system.web.ui.page.registerrequiresviewstateencryption.aspx
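    For reference, a control typically opts in with a call like this early in its lifecycle - a minimal sketch:

    protected override void OnInit(EventArgs e)
    {
        base.OnInit(e);

        // With the default ViewStateEncryptionMode of Auto, a single control
        // making this call causes the entire page's ViewState to be encrypted.
        Page.RegisterRequiresViewStateEncryption();
    }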

    All That Data, There Must be an Easier Way!

    Now, as we scrolled through all of the data in the output we could see what is stored in the ViewState, but tying it back to a particular control is not simple.  Also, since there was so much stored in this ViewState it was impossible to easily know which controls were the worst offenders.  So we moved to one last place - built-in ASP.NET tracing.  We enabled tracing (a sample configuration appears after the table below) because it produces a VERY helpful column - ViewState Size bytes.  This is reported for each control that is rendered, and it makes short work of identifying the worst offenders.  You will get output like this:

    Control ID    Type                                       Render Size bytes    ViewState Size bytes    ControlState Size bytes
    Control1      System.Web.UI.WebControls.DropDownList     0                    1084                    0
    Control2      System.Web.UI.WebControls.DropDownList     0                    1084                    0
    Control3      System.Web.UI.WebControls.DropDownList     0                    1084                    0

    So there we had it!  Control names and their ViewState Size.  We were able to quickly look at who the worst offenders are and start removing the unneeded stuff and fixing the application.
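    If you want to turn this on yourself, tracing can be enabled site-wide from web.config; here is a minimal sketch (requestLimit and localOnly are knobs you will want to adjust for your environment):

    <configuration>
      <system.web>
        <!-- pageOutput="true" appends the trace tables, including the
             ViewState Size bytes column, to the bottom of each page -->
        <trace enabled="true" pageOutput="true" requestLimit="40" localOnly="true" />
      </system.web>
    </configuration>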

    Reading ASP.NET Trace Information
    http://msdn.microsoft.com/en-us/library/kthye016(VS.80).aspx

    I hope this adds some additional information to the discussion of troubleshooting ViewState.

    Have a great day!

    Zach

  • PFE - Developer Notes for the Field

    $(SolutionDir) and MSBUILD

    • 2 Comments

    A while back I was working with a customer that was moving from doing all of their builds with the devenv command line to using MSBUILD.  However, we ran into a hiccup that did not make sense at first: when they tried to build their solution using MSBUILD, some of the custom build steps were failing because $(SolutionDir) was not defined. 

    After thinking about it a bit, it became clear that MSBUILD does not define the $(SolutionDir) property the way Visual Studio does.  So where did that leave us?

    There are a couple of basic approaches:

    1. Create a custom action (this would involve custom code) that walks up the directory structure, finds the SLN file, and uses that directory as the SolutionDir.
    2. Use the command line option /Property:SolutionDir=<sln Dir> to set the value; this way it is controlled in one place (see the sketch after this list).
    3. Change all of the paths to be relative.  In most groups it is relatively infrequent to move projects up and down the directory hierarchy.
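    A minimal sketch of option 2 (the project name and path are placeholders; note the trailing backslash, since project files usually concatenate $(SolutionDir) directly with relative paths, and quote the value if the path contains spaces):

    msbuild MyProject.csproj /p:Configuration=Release /p:SolutionDir=C:\src\MySolution\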

    Just thought I would share in case this comes up for someone else.

    Zach

  • PFE - Developer Notes for the Field

    All the Ways to Capture a Dump...

    • 4 Comments

    Frequently when troubleshooting a problem we capture a memory dump.  Memory dumps are a great tool because they are a complete snapshot of what a process is doing at the time the dump is captured, which makes them a great forensic tool.  There is, however, an abundance of tools to capture memory dumps (*.dmp).  In talking with some new members on our team we reviewed the dump capture techniques and their benefits and drawbacks.  If you have your own thoughts please feel free to comment.

    Memory dumps are captured for two categories of problems.  The first is a hang dump, captured when the application is non-responsive or when you just want a snapshot of the current state of the application.  The second is a crash dump, captured when exceptions occur.  The debugging purpose and the actions we would instruct the debugger to take are different for the two types of dump.

    For hang dumps, we instruct the debugger to attach to the process (if there is no debugger attached) and initiate the dump immediately.  For crash dumps, we attach a debugger to the process first, set it up to monitor the type of exceptions we are interested in, and generate a memory dump when such an exception is raised.  We will discuss different ways to generate memory dumps.

    First let's take a look at some of the more specific scenarios for capturing a dump file (these are grouped for the most part into crash or hang based on what they relate to; there are definitely a lot more variants, but these are some basic ones that we can use to compare the different debuggers):

    1. Crash
      1. Create a dump file when an application crashes (an exception has occurred that causes the process to terminate)
      2. Create a dump file from an application that fails during startup
    2. Hang
      1. Create a dump file when an application hangs (stops responding but does not actually crash)
      2. Create a dump file while an application is running normally 
    3. First Chance - Create a dump file when an application encounters any exception during its run
    4. Smaller Dump - Shrink an existing dump file to reduce the amount of information in it

    For each of these scenarios, how do the debuggers measure up?

    Debuggers / Scenarios        Crash   Hang   First Chance   Smaller Dump
    DebugDiag                    Yes     Yes    Yes            No
    ADPlus                       Yes     Yes    Yes            No
    CDB/WinDbg                   Yes     Yes    Yes            Yes
    Dr. Watson                   Yes     Yes    No             No
    UserDump                     Yes     Yes    Yes            No
    NTSD                         Yes     Yes    Yes            Yes
    Visual Studio                Yes     Yes    Yes            No
    Vista Crash Dump Features    Yes     Yes    No             No

    Next let’s take a quick look at how you might use each one, and then we will explore each a bit more:

    ADPlus
      Crash: adplus -crash -quiet -o c:\dumps -pn <processName.exe>
      Hang:  adplus -hang -quiet -o c:\dumps -pn <processName.exe>

    Dr. Watson
      Crash: see Additional Details below
      Hang:  see Additional Details below

    CDB and WinDbg
      Crash: .dump /ma c:\w3wp.dmp
      Hang:  .dump /ma c:\w3wp.dmp

    UserDump
      Crash: see Additional Details below
      Hang:  userdump <PID>

    DebugDiag
      Crash: create a Crash rule with the wizard
      Hang:  right-click the process in Processes and select Full Memory Dump

    MiniDumpWriteDump
      Crash: see Additional Details below
      Hang:  see Additional Details below

    Visual Studio
      Crash: Debug | Save Dump As...
      Hang:  Debug | Save Dump As...

    There are many ways to capture a memory dump as you can see above. Depending on the nature of the problem, a hang or crash dump can be captured using different tools.

    For a quick and easy hang dump in a production environment where installing tools might be an issue, the NTSD debugger distributed with Windows XP/Server 2003 is convenient.

    However, we frequently recommend that our customers have the Debugging Tools for Windows and DebugDiag available on their servers.  ADPlus has a very low footprint and is great for capturing memory dumps.  DebugDiag is super easy and great for setting up rules to generate dumps.  Unless there is a special case, DebugDiag is the way to go.

    Additional Details

    Below are some details for most of the tools that we outlined above and when they might come in handy.

    External Tools

    DebugDiag.exe from Debug Diagnostic Tool

    Where do I get it?

    DebugDiag is another free downloadable tool from Microsoft. To download, search for “Debug Diagnostic Tool v1.1” from http://www.microsoft.com/downloads or specifically http://www.microsoft.com/downloads/details.aspx?FamilyID=28bd5941-c458-46f1-b24d-f60151d875a3&displaylang=en

    How do I use it?

    Hang - On the “Processes” tab, right-click the process and select “Create Full User Dump”. The memory dump will be placed in the Logs\Misc folder under the DebugDiag installation folder.

    Crash - Select the Rules tab and click the “Add Rule” button. Select “Crash” as the rule type and click “Next”. Select the target type, for instance “A Specific Process”, and click through the remaining “Next” buttons to finish. DebugDiag will monitor for crashes, and when a memory dump is captured it will show the count in the “UserDump” column of the Rules panel.

    DebugDiag supports multiple crash rules to monitor multiple processes. It also supports many other configurations. For more information, please refer to the help file that is installed with DebugDiag.

    Also see the DebugDiag blog for additional information – http://blogs.msdn.com/debugdiag/

    Debugging Tools for Windows (WinDBG, CDB, ADPlus)

    Where do I get it?

    This set of tools is downloadable for free from http://www.microsoft.com/whdc/devtools/debugging/default.mspx

    How do I use it?

    Assuming it will be installed to C:\Debuggers folder, we will use that folder for examples.

    Catch a hang dump using ADPLUS.vbs

    Adplus.vbs is a VBScript that makes it easy to drive the CDB debugger. To capture a hang dump to the c:\dumps folder from a running instance of NOTEPAD.exe that has process ID 1806, type:

    C:\debuggers>adplus -hang -quiet -o c:\dumps -p 1806

    To capture hang dumps to c:\dumps for all instances of NOTEPAD.exe, type:

    C:\debuggers>adplus -hang -quiet -o c:\dumps -pn notepad.exe

    Catch a crash dump using ADPLUS.vbs

    To catch a crash dump from a running instance of NOTEPAD with process ID 1806, type:

    C:\debuggers>adplus -crash -quiet -o c:\dumps -p 1806

    To catch crash dumps on all instances of NOTEPAD.exe, type:

    C:\debuggers>adplus -crash -quiet -o c:\dumps -pn notepad.exe

    Note that after starting adplus, it will quickly exit, since all it does is set up the CDB debugger to do the work. After the adplus command window exits, you will notice a CDB command window come up (minimized by default). Just wait until CDB finishes; the command window will close.

    Catch a crash dump on any crashing process using CDB.exe

    Since the adplus utility can only catch crash dumps for a targeted process, it cannot be used system wide to capture a crash dump for any crashing process. If we want to overcome the limitations of Dr. Watson, we can use the CDB debugger for this purpose. To do this, type:

    C:\debuggers> cdb -iaec "-c \".dump /u /ma c:\dumps\av.dmp;q\""

    This configures the CDB debugger as the system-wide crash handler via the AeDebug registry key. You can verify the setting by browsing to the registry key:

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug

    And see these two values:

    Value name: Auto
    Value data: 1

    Value name: Debugger
    Value data: "c:\debuggers\cdb.exe" -p %ld -e %ld -g -c ".dump /ma /u c:\av.dmp;q"

    Except for a different debugger, this configuration is identical to the configuration described below for NTSD.exe. Since NTSD.exe does not have a feature to configure the AeDebug registry key for you, we have to do it manually in that scenario.

    The advantage of using CDB as system wide crash handler is that it can capture crash dumps for any process on Windows XP/Server 2003/Vista.

    There are a lot more features to discuss about the Debugging Tools for Windows and there are a lot of resources out there.

    Before we leave this topic, another cool feature of both CDB and WinDBG is that you can load an existing dump file into them for analysis.  If you want to generate a smaller dump file, for instance to share the callstacks or something basic with another person via e-mail, you can generate a minidump using this command while debugging the dump:

    .dump /m c:\smallerDump.dmp

    This can be very helpful for providing a bit of information without having to transfer a huge DMP file.

    UserDump.exe

    Crash Dumps and hang dumps can be taken with the User Dump tool.  For all of the details and download information check out http://support.microsoft.com/kb/241215

    Dumps and Development Time

    Now when you are developing there are a couple of cases where you want to look at dump generation.

    Visual Studio

    Where do I get it?

    You purchase it.

    How do I use it?

    To generate dump files you need to have the C++ tools installed, which will enable you to do native debugging.  Once you enable native debugging, whenever you have “broken” into the debugger you can go to the Debug menu and select Save Dump As…, which will allow you to save a dump file.  This can be very helpful on a developer's machine when they encounter a problem in an application they are debugging and want to save it for later analysis.  Also, I have used this to capture dumps of Visual Studio itself when it is acting up.  Just pop open a second instance of VS, attach to the first, and take a memory dump.

    MiniDumpWriteDump

    Where do I get it?

    This is an API that you would have to build into your application to enable your application to generate a dump file.

    How do I use it?

    This is an API to generate dumps, so it cannot be used directly; it has to be coded into a tool or application.  Customers will actually bake this into their native unhandled exception filter.  A dump can be taken with MiniDumpWriteDump using the following code:

    #include <windows.h>
    #include <strsafe.h>     // StringCchPrintf
    #include <dbghelp.h>     // MiniDumpWriteDump; link with dbghelp.lib
    #include <shellapi.h>
    #include <shlobj.h>
    
    int GenerateDump(EXCEPTION_POINTERS* pExceptionPointers)
    {
        BOOL bMiniDumpSuccessful;
        WCHAR szPath[MAX_PATH]; 
        WCHAR szFileName[MAX_PATH]; 
        const WCHAR* szAppName = L"AppName";
        const WCHAR* szVersion = L"v1.0";
        DWORD dwBufferSize = MAX_PATH;
        HANDLE hDumpFile;
        SYSTEMTIME stLocalTime;
        MINIDUMP_EXCEPTION_INFORMATION ExpParam;
    
        // Build a unique dump file name under %TEMP%\AppName
        GetLocalTime( &stLocalTime );
        GetTempPath( dwBufferSize, szPath );
        StringCchPrintf( szFileName, MAX_PATH, L"%s%s", szPath, szAppName );
        CreateDirectory( szFileName, NULL );
    
        StringCchPrintf( szFileName, MAX_PATH, L"%s%s\\%s-%04d%02d%02d-%02d%02d%02d-%ld-%ld.dmp", 
                   szPath, szAppName, szVersion, 
                   stLocalTime.wYear, stLocalTime.wMonth, stLocalTime.wDay, 
                   stLocalTime.wHour, stLocalTime.wMinute, stLocalTime.wSecond, 
                   GetCurrentProcessId(), GetCurrentThreadId());
    
        hDumpFile = CreateFile(szFileName, GENERIC_READ|GENERIC_WRITE, 
                    FILE_SHARE_WRITE|FILE_SHARE_READ, 0, CREATE_ALWAYS, 0, 0);
    
        // Point the minidump at the exception context that brought us here
        ExpParam.ThreadId = GetCurrentThreadId();
        ExpParam.ExceptionPointers = pExceptionPointers;
        ExpParam.ClientPointers = TRUE;
    
        bMiniDumpSuccessful = MiniDumpWriteDump(GetCurrentProcess(), GetCurrentProcessId(), 
                        hDumpFile, MiniDumpWithDataSegs, &ExpParam, NULL, NULL);
    
        CloseHandle( hDumpFile );
    
        return EXCEPTION_EXECUTE_HANDLER;
    }

    From http://msdn2.microsoft.com/en-us/library/bb204861.aspx

    Existing tools for Windows XP/Server 2003

    Dr. Watson

    Where do I get it?

    It is found at %SystemRoot%\system32\drwtsn32.exe and is distributed with Windows XP/Server 2003 and earlier versions.

    How do I use it?

    Configure Dr. Watson by typing:

    c:\windows\system32>drwtsn32

    You can configure the location of the memory dump (the default is C:\Documents and Settings\All Users\Application Data\Microsoft\Dr Watson\user.dmp) and the type of memory dump (for dump analysis purposes, select “Full” for Crash Dump Type; the default is Mini). The problem with Dr. Watson is that it overwrites the user.dmp file. If more than one process crashes, or the same process crashes for different reasons, you have to make sure to move the user.dmp file before it is overwritten.

    NTSD.exe

    Where do I get it?

    From %SystemRoot%\System32 folder and is a debugger distributed with Windows XP/Server 2003 and earlier versions.

    How do I use it?

    It’s a command line debugger and can be used to capture crash dumps as well as hang dumps.

    Crash Dump - To set up NTSD as the system-wide handler for crashes, add two values under the AeDebug registry key:

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug

    Value name: Auto
    Value data: 1

    Value name: Debugger
    Value data: "ntsd.exe" -p %ld -e %ld -g -c ".dump /ma /u c:\av.dmp;q"

    Make sure to include the quotes in Value data.

    Upon a process crash, the OS will invoke NTSD.exe, which will in turn attach to the process and capture a memory dump by executing the .dump debugger command.

    If you are curious what the command line means: the “-c” switch specifies the debug commands NTSD should execute. The command “.dump” generates the memory dump; the “/ma” switch requests a minidump with all data streams, and the “/u” switch makes the debugger use a unique file name based on a timestamp. The “;” delimits the first command, and the second command, “q”, makes the debugger terminate the process and exit.

    Hang Dump - NTSD is a full featured debugger. It can be used to capture a hang dump given the process ID. For example, if NOTEPAD.exe is running with process ID 1806 (process ID can be obtained from task manager), you can type this command to capture a hang dump:

    C:\>ntsd -c ".dump /ma /u c:\hang.dmp;qd" -p 1806

    If you don’t like to type the whole command line again and again, you can create a batch file: HangDump.bat with this command:

    ntsd -c ".dump /ma /u c:\hang.dmp;qd" -p %1

    To use the batch file, type:

    C:\>HangDump 1806

    Generating dumps with Vista

    Vista does not ship with Dr. Watson or NTSD. Instead, Vista has “Problem Reports and Solutions” which is located under Control Panel->System and Maintenance.

    Capture a hang dump in Vista

    Inside Task Manager, right-click the process and select “Create Dump File”.

    [Screenshot: Task Manager with the right-click “Create Dump File” option on a process]

    Capture a crash dump in Vista

    For a process crash, Windows Error Reporting (WER) will try to analyze the crash first and then display this message box (where BADAPP is the crashing application):

    [Screenshot: Windows Error Reporting dialog for the crashing application BADAPP]

    When you click “Close program”, the process will exit. By default no crash dump file is kept on the local system; however, if you wish to retain it, you can set the ForceQueue registry value to 1 under:

    HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\Windows Error Reporting

    Dump files for processes running in the system context or elevated are located at:

    %ALLUSERSPROFILE%\Microsoft\Windows\WER\[ReportQueue | ReportArchive]

    Dump files from other processes are located at:

    %LOCALAPPDATA%\Microsoft\Windows\WER\[ReportQueue | ReportArchive]

    Starting with Vista SP1 and Windows Server 2008, more crash dump features are supported. For more information, see Collecting User-Mode Dumps at http://msdn2.microsoft.com/en-us/library/bb787181(VS.85).aspx

    Credits

    A lot of people contributed thoughts, comments, and work to this, so I just want to give them credit:

    Linkai Yu who did the bulk of the work. (I will introduce you to him in a subsequent post!)
    Michael Wiley, Pierre Donyegro, David Wu, Greg Varveris, Sean Thompson and Aaron Barth who contributed and helped review.

  • PFE - Developer Notes for the Field

    Debugging Internet Explorer Security Warnings

    • 2 Comments

    My name is Norman and I’ve been working with customers the past few years debugging a variety of problems, but maintaining a focus on Internet Explorer. This was an interesting issue I ran into with a customer the other day.

    ===================

    I know no one has ever been in a situation where you typed credit card information into a “secure” SSL web site and hit the Submit button, only to see the security warning below.

    [Screenshot: Internet Explorer security warning about mixed secure and nonsecure content]

    Personally speaking I’d cancel my order and never shop there again. This, though, is a pretty common scenario as of Internet Explorer 6 and is meant to help you. It is not telling you there is an error. Think of it as a warning sign of things to come.

    If you are lucky enough to have to debug this, it is pretty easy to do assuming you have some debug tools handy. Just grab a dump file or attach to the process (iexplore.exe) using your favorite debug tools.

    Steps

    Please note the register and hex values will vary.

    1. Switch to thread 0

    This wasn’t a wild guess - although you could dump out all threads and look for a LaunchDlg function, dialogs run on thread 0 in a Windows application.

    0:013> ~0s
    eax=00600650 ebx=00000000 ecx=00422dc0 edx=7c90eb01 esi=0062d298 edi=00000001
    eip=7c90eb94 esp=001399f4 ebp=00139a28 iopl=0 nv up ei pl zr na pe nc
    cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00000246
    ntdll!KiFastSystemCallRet:
    7c90eb94 c3 ret

    2. Show the full callstack

    0:000> kb100
    ChildEBP RetAddr Args to Child
    001399f0 7e419418 7e42dba8 00220324 00000001 ntdll!KiFastSystemCallRet
    00139a28 7e42593f 001503fa 00220324 00000010 USER32!NtUserWaitMessage+0xc
    00139a50 7e425981 771b0000 772431c0 001103d6 USER32!InternalDialogBox+0xd0
    00139a70 7e42559e 771b0000 772431c0 001103d6 USER32!DialogBoxIndirectParamAorW+0x37
    00139a94 77fa9eb1 771b0000 0000049b 001103d6 USER32!DialogBoxParamW+0x3f
    00139ab4 7722f51a 771b0000 0000049b 001103d6 SHLWAPI!DialogBoxParamWrapW+0x36
    0013a52c 7722dce4 001103d6 0013a55c 0000049b WININET!LaunchDlg+0x6c1
    0013a578 7dd35417 001103d6 00000000 7722f673 WININET!InternetErrorDlg+0x34f
    0013a5c0 7deaea5f 00000000 0013a678 00000001 mshtml!CMarkup::ValidateSecureUrl+0xf3
    0013e780 7de92e8b 7dcd0709 02d589e0 10a9ffa1 mshtml!CObjectElement::CreateObject+0x48d
    0013e784 7dcd0709 02d589e0 10a9ffa1 00000000 mshtml!CHtmObjectParseCtx::Execute+0x8
    0013e7d0 7dc9cf87 02d58ac0 02d589e0 7dcc4bad mshtml!CHtmParse::Execute+0x41
    0013e7dc 7dcc4bad 7dcc4bcb 10a9ffa1 02d589e0 mshtml!CHtmPost::Broadcast+0xd
    0013e898 7dcb4c7b 10a9ffa1 02d589e0 02d402d0 mshtml!CHtmPost::Exec+0x32f
    0013e8b0 7dcb4c20 10a9ffa1 02d402d0 02d589e0 mshtml!CHtmPost::Run+0x12
    0013e8c0 7dcb505f 02d402d0 10a9ffa1 02d589e0 mshtml!PostManExecute+0x51
    0013e8d8 7dcb4fe2 02d589e0 00000001 7dcb4038 mshtml!PostManResume+0x71
    0013e8e4 7dcb4038 02d58a50 02d589e0 0013e928 mshtml!CHtmPost::OnDwnChanCallback+0xc
    0013e8f4 7dc9cb7d 02d58a50 00000000 00000000 mshtml!CDwnChan::OnMethodCall+0x19
    0013e928 7dc98977 0013eac4 7dc98892 00000000 mshtml!GlobalWndOnMethodCall+0x66
    0013ea5c 7e418734 001103d8 00008002 00000000 mshtml!GlobalWndProc+0x1e2
    0013ea88 7e418816 7dc98892 001103d8 00008002 USER32!InternalCallWinProc+0x28
    0013eaf0 7e4189cd 00000000 7dc98892 001103d8 USER32!UserCallWinProcCheckWow+0x150
    0013eb50 7e418a10 0013eb90 00000000 0013eb78 USER32!DispatchMessageWorker+0x306
    0013eb60 75f9d795 0013eb90 00000000 00162f90 USER32!DispatchMessageW+0xf
    0013eb78 75fa51da 0013eb90 0013ee98 00000000 BROWSEUI!TimedDispatchMessage+0x33
    0013edd8 75fa534d 00162d68 0013ee98 00162d68 BROWSEUI!BrowserThreadProc+0x336
    0013ee6c 75fa5615 00162d68 00162d68 00000000 BROWSEUI!BrowserProtectedThreadProc+0x50
    0013fef0 7e318d1e 00162d68 00000000 00000000 BROWSEUI!SHOpenFolderWindow+0x22c
    0013ff10 00402372 001523ba 000207b4 00094310 SHDOCVW!IEWinMain+0x129
    0013ff60 00402444 00400000 00000000 001523ba iexplore!WinMainT+0x2de
    0013ffc0 7c816fd7 00094310 0079f0e0 7ffd7000 iexplore!_ModuleEntry+0x99
    0013fff0 00000000 00402451 00000000 78746341 kernel32!BaseProcessStart+0x23

    3. Dump out the second parameter to mshtml!CMarkup::ValidateSecureUrl

    0:000> du 0013a678
    0013a678 "http://download.macromedia.com/p"
    0013a6b8 "ub/shockwave/cabs/flash/swflash."
    0013a6f8 "cab#version=6,0,79,0"

    For more information on what ValidateSecureUrl does, please click here.

    4. Update the code

    So in my customer’s case it wasn’t their custom code; it was really just a reference to a Macromedia object. The easy fix for my customer was to change the “http” to “https”, and in the eyes of their Internet customers the site was secure again.
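    In markup terms the fix presumably looked something like the following (the classid is the standard Flash ActiveX one; the width, height, and movie parameter are illustrative):

    <!-- codebase previously used http://, which triggered the mixed-content warning -->
    <object classid="clsid:D27CDB6E-AE6D-11cf-96B8-444553540000"
            codebase="https://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,79,0"
            width="468" height="60">
        <param name="movie" value="banner.swf" />
    </object>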

  • PFE - Developer Notes for the Field

    Investigate High CPU for a Process: Part 1

    • 0 Comments

    Hello World! This is my first ever blog post and I hope there are more to come after this. My name is Leo Leong and I am one among many in the field who go out to attend to situations when there is a dev issue. In many ways, I've always thought that we are like the "mighty mouse" of our field: "Here I come to save the day...". Well, it's not always like that, but that's how I'd like to think of us :)

    Alright, enough said, let's get back on track with the real deal. The real world has proven to be a much more complicated arena, but for simplicity's sake I've created an academic sample here to illustrate what we are going to accomplish: finding out what a process is doing when you see it taking up much of your system's CPU. Imagine the following scenario:

    I have developed a web site and whenever I browse to a page, it pegs my machine's CPU. What is it doing?

    Basically, there are quite a few tools that come to mind, but I will introduce two of them in this first part because they are platform (x86/x64) and .NET version agnostic. But more importantly, we can't use Visual Studio!

    The first tool is called Process Explorer. You can find more details and the download here. It is a very GUI-driven tool and does a good job of presenting what you might be interested in within any given process. Here's what it looks like when I looked at the stack trace for the worker process's (w3wp.exe) threads, and specifically the thread with the highest CPU:

    [Screenshot: Process Explorer showing the stack trace of the highest-CPU thread in w3wp.exe]

    Note, you will have to configure symbols to be able to see the function names shown in the above screen shot. To do that, simply go to Options | Configure Symbols... and put in a symbols path such as the following:

    [Screenshot: Process Explorer's Configure Symbols dialog with the symbol path filled in]

    I've made a number of assumptions here: that you understand what a process, a thread, and a stack are, and that you are aware of what managed and native code is. But regardless, the essence of this tool is that it gives you a stack trace of any thread within a process, which makes it very powerful.

    So, we are halfway through this problem and have discovered what the stack looked like for the thread in question causing high CPU. Because this is a .NET application (more accurately, ASP.NET), we need something more to dig in further. Hence, we bring out the next tool, called Windbg. More details and the download can be found here. This is pretty much the de facto tool used within Microsoft to debug any issue, and you'll hear about it a lot from us. Explaining all the details of debugging a managed application is again outside the scope here, but this is the command you can use to get the stack traces for all managed threads:

    ~*e!clrstack

    When you attach the debugger (Windbg) to the worker process and issue the above command, this is roughly what it will look like:

    [Screenshot: WinDbg output of ~*e!clrstack showing the managed call stacks]

    The command used above is part of the managed debugging extension called SOS, and you will need to load it in the debugger before you can debug any managed application. In case you are not already familiar with it, the following loads SOS:

    .loadby sos mscorwks

    So, what can we conclude so far? Well, suffice it to say the function HighCPULoop_Click is the root cause of the problem here. A for loop that goes from 0 to int.MaxValue/2, which is about 1 billion, is a lot to loop through!! Once again, this is only an academic sample that I've whipped up in my spare time to represent a high CPU scenario. There are lots of reasons why a good for loop can go bad :)
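    For the curious, the handler in a sample like this boils down to nothing more than a tight loop on the request thread - roughly the following (the label control is illustrative):

    protected void HighCPULoop_Click(object sender, EventArgs e)
    {
        long total = 0;

        // Roughly one billion iterations, all on the ASP.NET request thread,
        // which is what pegs one CPU for the duration of the request.
        for (int i = 0; i < int.MaxValue / 2; i++)
        {
            total += i;
        }

        ResultLabel.Text = total.ToString();
    }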

  • PFE - Developer Notes for the Field

    WinDBG and Hangs When Debugging Managed Dumps

    • 0 Comments

    Because this is my first post on this blog, let me introduce myself.  My name is John Allen.  I’ve been on the PFE team for 5 years and been with Microsoft for 9 years.  All of those years I’ve been debugging and troubleshooting all kinds of customer applications.   I focus on all developer technologies and products.

    One of the main things that I and the others on the PFE - Developer team do is capture and analyze memory dump files.  A majority of the time we are debugging .NET applications, and we have found that sometimes the debugger will hang while trying to analyze a dump.  There are a couple of key characteristics:

    1. This is a 100% CPU Hang and the debugger does not recover - If your machine has more than one CPU the debugger will only be fully utilizing one of them.
    2. The Hang occurs when loading symbols for a managed assembly
    3. The Application that was dumped is .NET Framework 2.0 or greater.

    We will talk more about symbol loading in a subsequent post but here is a workaround for this problem:

    1. Create an empty SymSrv.ini file in your Debuggers Directory (Default - C:\Program Files\Debugging Tools for Windows).  For more information check out the Symsrv.ini documentation on MSDN - http://msdn2.microsoft.com/en-us/library/ms681416.aspx
    2. Then open that file and add the following:
    [exclusions] 
    
    System.Windows.Forms.pdb
    System.pdb
    System.Web.pdb

    After you have done this you can try loading the dump file again and see if the hang recurs.  If this does not resolve your problem, run the following command:

    0:020> lm m *_ni
    start    end      module name
    637a0000 63d02000   System_Xml_ni             (deferred)
    64890000 6498a000   System_Configuration_ni   (deferred)
    65140000 657a6000   System_Data_ni            (deferred)
    65f20000 66ac6000   System_Web_ni             (deferred)
    698f0000 69ad0000   System_Web_Services_ni    (deferred)
    790c0000 79b90000   mscorlib_ni               (deferred)
    7a440000 7ac06000   System_ni                 (deferred)

    This will list all of the native image assemblies - these are the assemblies that have been NGened and are loaded from the NGen cache.  Then, for each assembly that is listed, add it to the [exclusions] list in the SymSrv.ini file.  Be sure to remove the _ni, replace the remaining underscores with dots, and add the .pdb extension.  For instance, for the above output the [exclusions] list would look like:

    [exclusions]
    System.Xml.pdb
    System.Configuration.pdb
    System.Data.pdb
    System.Web.pdb
    System.Web.Services.pdb
    mscorlib.pdb
    System.pdb

    Finally, if you are looking for additional information on managed debugging you should check out Tess's blog - http://blogs.msdn.com/tess/

    Thanks,
    John Allen

     

  • PFE - Developer Notes for the Field

    AssemblyResolve Event and VJSharpCodeProvider

    • 9 Comments

    This one kept me up late the other night.  We had an issue where ASP.NET was recompiling a page and we were getting an exception:

    System.Exception:

    Error loading VJSharpCodeProvider, Version=2.0.0.0, Culture=neutral,PublicKeyToken=b03f5f7f11d50a3a

    This was interesting for two reasons:

    1. The application does not use J#.
    2. This is more subtle, but it nagged at the back of my mind - why is this thrown as a System.Exception and not a System.IO.FileNotFoundException or something more explicit, since the framework typically throws specific exceptions?

    Some additional background: this problem was only occurring during acceptance testing and did not occur in dev.  They were able to load the page and successfully run through the application.  Then, after stressing the application, they would eventually start getting these J# errors and other dynamic compilation errors from ASP.NET.  This continued until they restarted IIS and cleared the Temporary ASP.NET Files folder.

    They did not have the J# runtime installed on the system in dev or test, and this was completely expected because they were not using J#.  I got a look at the application and verified this.

    Side Note - I have been doing product support for all 6 years that I have been working at Microsoft.  For those of you out there doing similar stuff I am sure you know this and if not take heed - always verify things when working to resolve issues like this.  It is not that anyone out there is intentionally hiding things but people frequently are unaware and when you ask, "Is such and such the case?" they may very well answer incorrectly.  So verify and double check.  Part of having someone from support come in is to bring a fresh perspective that analyzes the problem from the outside and that can frequently be invaluable.

    So there is no J#.....Then why is this even trying to load?

    I am glad you asked, because I have an answer.  Deep in the bowels of the ASP.NET compile engine and the CodeDOM, we enumerate all of the "code providers" and see if they are valid.  Part of this test is done by trying to load the assemblies.  The default code providers are:

    • C#
    • VB.NET
    • JavaScript
    • J#
    • MC++

    You can actually see these defaults documented in the machine.config.comments file in your %windir%\Microsoft.net\framework\v2.0.50727\config folder:

    <compilers>
        <compiler language="c#;cs;csharp" extension=".cs" type="Microsoft.CSharp.CSharpCodeProvider, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" />
        <compiler language="vb;vbs;visualbasic;vbscript" extension=".vb" type="Microsoft.VisualBasic.VBCodeProvider, System, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089" />
        <compiler language="js;jscript;javascript" extension=".js" type="Microsoft.JScript.JScriptCodeProvider, Microsoft.JScript, Version=8.0.1100.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a" />
        <compiler language="vj#;vjs;vjsharp" extension=".jsl;.java" type="Microsoft.VJSharp.VJSharpCodeProvider, VJSharpCodeProvider, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a" />
        <compiler language="c++;mc;cpp" extension=".h" type="Microsoft.VisualC.CppCodeProvider, CppCodeProvider, Version=8.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a" />
    </compilers>

    Those are the 5 that get searched by default, and you can add additional code providers in your config file (as far as I can tell, you cannot remove these 5 defaults).

    All of this said, it would seem that every application that does a dynamic compile must be looking for this J# provider.  So I went to do a quick test.  I created a simple ASPX page that did a dynamic compile.  Before I ran the page I went to the Fusion Log Viewer (http://msdn2.microsoft.com/en-us/library/e74a18c4(VS.80).aspx), a tool that lets me see what happens during binds.  I clicked "Settings...", checked "Log Binding Failures to Disk", and exited the application.  Then I did an IIS reset to make sure the setting was picked up, and browsed to my page.  After the page came up I reopened the Fusion Log Viewer, and sure enough there was a bind failure for the VJSharpCodeProvider.  If you follow these same steps, assuming you do not have the J# runtime installed, you should see the same. 

    So I was seeing the same failure to load the VJSharpCodeProvider, but I did not get any errors and the dynamic compilation did not fail.  This, however, is similar to what they see - it works at first and then later breaks.  Reviewing briefly: we know that the VJSharpCodeProvider is not on their system, but the failure to load this file is normal, so why does it start becoming a problem?

    Well, this is where their ResolveEventHandler (http://msdn2.microsoft.com/en-us/library/system.resolveeventhandler.aspx) comes in.  In one event log entry where they log the callstack, we found a ResolveEventHandler on top, where they subscribed to the AssemblyResolve event - http://msdn2.microsoft.com/en-us/library/system.appdomain.assemblyresolve.aspx.  From there we began looking into the code. 

    First, some background: the AssemblyResolve event is part of the assembly loading that .NET does.  After the runtime has searched for and failed to find an assembly, it raises this event.  The application can then explicitly load the assembly (typically using LoadFrom or something similar) and return a reference to it; this is essentially a way to redirect the application to an assembly.  Once we got the code, we found that they handled a couple of specific assemblies, and for any they did not have a location for they just threw a System.Exception.  Ahhhhh, that explains the nagging question about why the exception type was System.Exception and not something else.

    We took a look back at the AssemblyResolve event, and the docs have a glaring oversight - what do you do when you receive a request for an assembly whose location you do not know?  The answer turns out to be: return null.  I verified this internally, and someone in the community has been nice enough to add this note in the "Community Content" section of the docs - http://msdn2.microsoft.com/en-us/library/system.appdomain.assemblyresolve.aspx
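    In code, a well-behaved handler looks roughly like this (the assembly name and path are illustrative):

    using System;
    using System.Reflection;

    internal static class AssemblyRedirector
    {
        internal static void Install()
        {
            AppDomain.CurrentDomain.AssemblyResolve += OnAssemblyResolve;
        }

        private static Assembly OnAssemblyResolve(object sender, ResolveEventArgs args)
        {
            // Redirect only the assemblies we actually know how to locate.
            if (args.Name.StartsWith("MyCompany.Rules", StringComparison.OrdinalIgnoreCase))
            {
                return Assembly.LoadFrom(@"C:\MyApp\bin\MyCompany.Rules.dll");
            }

            // For anything we do not recognize (such as the probe for
            // VJSharpCodeProvider), return null so the CLR treats it as a
            // normal load failure instead of an unexpected exception.
            return null;
        }
    }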

    In conclusion: when the application is first launched, their code has not executed yet, so the ResolveEventHandler has not been hooked up to the AssemblyResolve event.  Later in the run of the application, when a recompile occurs and the runtime reprobes for the VJSharpCodeProvider, the exception is thrown.  The ASP.NET compile code is not expecting an exception, and this leads to the other errors.

    Once we fixed up the handler to return NULL the application worked.  It sailed through testing and I got to go to bed!!  I hope that this helps someone else out.

    Thanks,
    Zach

  • PFE - Developer Notes for the Field

    ASP.NET and Unit Tests

    • 1 Comments

    I was onsite the other day and the customer wanted to use Visual Studio 2005 to auto-generate the unit test stubs for their ASP.NET application.  They have a lot of rules tied up in the ASP.NET application project.  When we tried to generate the unit tests we kept getting the following error:

    Source Code cannot be sufficiently parsed for code generation.  Please try to compile your project and fix any issues.

    Well, needless to say, the project compiled fine (and had been for a while) and ran fine.  We could not find any issues to fix.  My first thought was that this was a limitation stemming from the way they created their ASP.NET project.  So we got some details:

    1. Originally Created in VS.NET 2003
    2. Migrated to VS 2005
    3. They did not convert to the new Page Model with App_Code and such.

    After doing some digging I found out that the unit test generation for some reason expects the project to have at least an empty "Settings" section.  So we opened the properties on the project and clicked Settings.  Sure enough, there was nothing there.  We clicked the link to create the "Settings" and got the grid that would let us add settings.  We just left it blank and rebuilt.  At that point we retried the generation of the unit tests and everything went smoothly.

    Thanks,
    Zach
