John's Blog

  • Automatically Launching Local Apps in a Virtual Environment

    This article will describe a method for having a locally-installed process automatically launch inside a virtual environment. This method does not require publishing new shortcuts for the local application.

    Background

    App-V customers often have interesting, advanced scenarios that require locally installed applications to run inside of a virtual environment. For example, let's say that the customer has a web application that requires an ActiveX control in order to function correctly. However, they don't want to allow their users to install / upgrade ActiveX controls for security reasons. And, they don't want other websites to have access to this ActiveX control, in case it has any security vulnerabilities that a malicious website could use to attack their enterprise. What can they do? How can they deploy / service this app's ActiveX control safely? At first glance, it may seem like App-V can't be helpful in this situation, since Internet Explorer is locally installed.

    It turns out that App-V can be used to help solve this deployment / servicing problem. This topic has already been covered in other articles. The basic technique is as follows:

    1. Create a sequencing that contains just the ActiveX control(s)
    2. Create an OSD for this package that launches Internet Explorer and takes the user directly to the web application's URL
    3. Publish a shortcut that the user clicks on when they want to run the web application

    When the user clicks on the shortcut for the web application, Internet Explorer will launch inside the virtual environment of the ActiveX control and the web application will be fully functional. This approach has a few key benefits:

    1. It normalizes the deployment / servicing model for the enterprise, allowing ActiveX controls to be deployed and serviced just like standard applications.
    2. The user can run the application without having to be given permission to install ActiveX controls.
    3. Regular browser sessions cannot run the ActiveX control, preventing malicious websites from leveraging any security vulnerabilities that may exist in the control.
    4. It integrates the web application into Windows, making it seem more like a regular application. The user no longer needs to launch the browser and then browse to the app's URL. The shortcut that is published takes care of this when they click on it.

    With the advent of Dynamic Suite Composition in App-V 4.5, this approach has been pushed even further. Now, instead of having Internet Explorer launch in the Virtual Environment of a single package, some customers have started running IE run in a Virtual Environment that contains components from multiple App-V packages. The approach taken is similar to what was done previously, with a couple of additional steps. The basic technique is as follows:

    1. Create an empty sequencing that represents the settings for Internet Explorer. Have the OSD for this package launch Internet Explorer. This package will be the primary package that stores all of the settings for virtual Internet Explorer.
    2. Create additional sequencings, one for each IE plug-in you want to make available to your users. Make these packages optional secondary packages of the primary Internet Explorer package that you created in step 1.
    3. Use the standard entitlement mechanism in App-V to control which plug-ins are available for different users.

    This technique has all of the same benefits of the previous one, but provides additional entitlement and servicing granularity. However, some customers don't like having to publish a separate shortcut for the local application (in this case, Internet Explorer) in order to leverage it. In some cases, they don't want to have to train their users to click on a different shortcut. Or, in the case of complex applications, like Microsoft Office, there are just too many ways that the application can be launched other than through its shortcuts, such as the myriad file type associations that Office publishes. What they're looking for is the ability to express that a locally-installed application should always run inside of its virtual environment, regardless of how it gets launched.

    This article will describe how to accomplish this scenario.

    Overview

    If you've ever been curious enough to look at the shortcuts that App-V publishes for virtual applications, you may have noticed something interesting. The shortcuts don't point directly to the app that is going to be launched. Rather, they point to App-V launcher process (sfttray.exe) and pass a parameter that tells the launcher which application should be launched. So, for example, let's say you have virtualized Microsoft Word 2007. If you look at the properties of the Microsoft Word shortcut, you might expect to see a path to the exe, something like: Q:\office\office12\winword.exe. However, instead, you will see something like: "C:\Program Files\Microsoft Application Virtualization Client\sfttray.exe" /launch "Microsoft Office Word 2007". This is the mechanism that App-V uses to ensure that the Virtual Environment for the application gets set up before the process is launched. It is the only supported way for a non-virtual process to launch a process inside of a virtual environment.

    So, we need to find a way to have the launch of a local process trigger this mechanism, so that it will come up inside a virtual environment. To do this, we will need to leverage some developer-focused features of Windows and some custom C/C++ code.

    Intercepting Process Creation

    To achieve our goal, the first thing we need to be able to do is to intercept local process creation and change what happens during launch. In order to do this, we can leverage a developer-focused feature of Windows called Image File Execution Options. If you're interested in reading more about Image File Execution Options, read this article. Basically, Image File Execution Options allows control over certain aspects of a process' launch. The option we're interested in using is the Debugger option. This option is designed to allow developers to intercept the creation of a process and start the process in a debugging session. However, there is no reason that we can't leverage this mechanism to achieve our non-debugging goals.

    The Dangers of Infinite Recursion

    Before I discuss how to use the Debugger option, I need to point out one danger with using it. The Debugger option will catch every process launch for a particular binary name (e.g. winword.exe) and replace the launch with another process. So, if I specify replace.exe as the Debugger for original.exe, when I try to launch original.exe, replace.exe will actually get launched instead. This may seem fine, since it is what we want to happen. But, what happens if replace.exe eventually tries to launch original.exe? You guessed it, replace.exe will actually get launched. And, if that instance of replace.exe eventually tries to launch original.exe, replace.exe will get launched again. In programming, this process of something calling itself is called recursion. If it continues without stopping, it is called infinite recursion. As you can probably guess, infinite recursion is usually a very bad thing.

    The reason that I bring up the concept of infinite recursion is that naïve usage of Debugger can result in infinite recursion and recovering from this can be a pain. In fact, the simplest way to recover is typically to reboot (there are more advanced ways of recovering, such as using pskill.exe from Sysinternals, but I'm not going to get into that in this article).

    Avoiding Infinite Recursion

    There are several ways to avoid this type of infinite recursion in practice. While most of them require you to write custom code, the way I'm going to demonstrate does not require any custom code to be written. Instead, it relies on App-V's Virtual Registry to hide the Debugger value inside the Virtual Environment, thereby breaking the recursion cycle. To do this, we simply add some elements to the OSD file for the primary package that tell App-V to hide Debugger value. Here is a sample that will hide the Debugger value for notepad.exe, which would be stored under the following registry key: 'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\notepad.exe'.

    <SOFTPKG …….>

        <IMPLEMENTATION>

            <VIRTUALENV …….>

                <REGISTRY>

                    <REGKEY HIVE="HKLM" KEY="SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\notepad.exe">

                        <REGVALUE NAME="Debugger" REGTYPE="REG_SZ"></REGVALUE>

                    </REGKEY>

                </REGISTRY>

            </VIRTUALENV>

        </IMPLEMENTATION>

    </SOFTPKG>

    As you can see in this sample, a <REGISTRY> element was added under the existing <VIRTUALENV> element. Then, REGKEY and REGVALUE elements were added to set the appropriate value in the registry. Remember that the key is specific to the name of the binary being launched. So, if you wanted to change the binary in the local registry, you absolutely need to update the OSD file according and make sure that App-V gets receives the update by triggering a publish refresh operation.

    Setting the Debugger Value

    Now that we have guarded against the danger of infinite recursion, it is time to set the Debugger value in the local registry. To set this value, you need two pieces of information:

    1. The name of the exe that you want to intercept
    2. The name that you gave to the app in the OSD of the primary package

    For our example, let's assume that we are trying to intercept notepad.exe and we want it to run inside the Virtual Environment represented by an app named "Virtual Office". To do this, we would create a key in the local registry named: 'HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\notepad.exe'. Next, we would create a new String Value and name it Debugger. Then we would set the Debugger value to:

    sfttray.exe /exe sftlp.exe /launch "Virtual Office"

    I'm not going to go into the details of what this command does. But, basically, it launches notepad.exe in the Virtual Environment of the Virtual Office app.

    Summary

    In summary, the steps you need to take to automatically launch a locally-installed app inside a virtual environment are:

    1. Create a sequencing that contains the virtual environment you want the local app to run inside of. Remember the name you gave the app (in the OSD). For our example, we called it Virtual Office.
    2. Update the OSD with the correct REGISTRY tags to prevent infinite recursion.
    3. Create the appropriate Debugger value in the local registry.

    Once you have performed these steps, the app will always get created in the appropriate Virtual Environment.

  • HTTP Publishing in App-V (Part 1)

    With all the excitement around HTTP streaming in App-V 4.5, many people fail to realize HTTP Publishing has been a feature of App-V since…well…the very first release of SoftGrid. This series of articles will discuss how to build your own App-V publishing server. In this first article, I will go over how to create a very basic publishing server. In future articles, I will discuss advanced features that can be enabled using a custom HTTP publishing server.

    Overview

    HTTP Publishing is a surprisingly powerful feature. It provides you with complete control over which applications are delivered to each user. The criteria that you choose to make this decision are completely up to you. For example, you could base it on the ACLs associated with the package files. With this strategy, if the user has access to package files, then the applications in the package are published to the user. This simple policy provides similar entitlement functionality that is available with the App-V Management Server. It is the one that will be demonstrated in this article.

    Much more interesting policies are also possible. For example, entitlement could be based on information in the corporate HR database. Or, it could be based on some other attributes on the user's Active Directory object. Another option would be to query some custom SQL database to determine which applications the user is entitled to use. Which option you choose will depend on the needs of your particular scenario.

    Publishing Architecture

    The publishing architecture in App-V is fairly simple. The client sends a request to the server asking for the list of applications for the user. This request is sent while impersonating the user, so that the server can access the user's token, if needed.

    The server's response is a single XML file that contains the publishing information associated with each application, including its shortcuts, file type associations and DDE entries. The rest of this article describes how to generate this document in the appropriate format.

    Generating the Publishing Document

    Prior to App-V 4.5, creating a Publishing Server was a reasonably complicated task. It required you to generate a properly formatted XML document describing all of the applications, shortcuts, file type associations and DDE entries for each package that a user was entitled to use. App-V 4.5 introduced the concept of package manifest files, which contains all of this information in the correct format. This makes creating an HTTP publishing server a much easier task.

    The following is an example publishing document that contains the information for a single application. While it contains a lot of information, most of it comes directly from the package's manifest file. So, you will not have to deal with it directly. It is included purely for illustrative purposes.  Click on the image to see a larger version.

     

    As can be seen in this sample, the document element is called <DESKTOPCONFIG>. It contains two child elements, <POLICY> and <APPLIST>. The <POLICY> element has a <REFRESH> element that allows you to specify the Publishing Refresh frequency, in minutes and a boolean that determines if Publishing Refresh occurs when the user first logs in.

    The <APPLIST> section is nothing but a concatenation of the <APPLIST> sections from each of the package manifests to which the user has access. The sample code that follows describes how to generate this list.

    The following is the ASPX page that will be used in this sample. It contains the basic structure of the Publishing document.

    Copyright (c) Microsoft Corporation.  All rights reserved.  For personal use only.  Provided AS IS and WITH ALL FAULTS.  
    

    <%@
    										Page
    										Language="C#"
    										AutoEventWireup="true"
    										CodeBehind="publishing.aspx.cs"
    										Inherits="appv_publishing_service.publishing"
    										ContentType="text/xml"
    										%> 
    								

    <DESKTOPCONFIG> 
    								

    <POLICY
    										MANAGEDDESKTOP="TRUE"
    										REPORTING="FALSE"> 
    								

    <REFRESH
    										ONLOGIN="TRUE"
    										PERIOD="60"/> 
    								

    </POLICY> 
    								

    <APPLIST> 
    								

    <%
    										this.generate_app_xml(); %> 
    								

    </APPLIST> 
    								

    </DESKTOPCONFIG>


    The <POLICY> section has already been described above. The <APPLIST> element is where all of the application-specific publishing information is placed. This information is taken directly from the manifest files that were generated by the Sequencer. In this sample ASPX page, the work of extracting the publishing information from the manifest files is delegated to a C# method in the CodeBehind file, named generate_app_xml(). The contents of the CodeBehind file are listed below.

    Copyright © Microsoft Corporation.  All rights reserved.  For personal use only.  Provided AS IS and WITH ALL FAULTS.  
    

    // CodeBehind File: publishing.aspx.cs 
    								

    using System; 
    								

    using System.Xml; 
    								

    using System.IO; 
    								

    namespace appv_publishing_service 
    								

    { 
    								

    public
    										partial
    										class
    										publishing : System.Web.UI.Page 
    								

    { 
    								

    protected
    										void generate_app_xml() 
    								

    { 
    								

    string root_phys = MapPath("."); 
    								

    string[] dirs = Directory.GetDirectories(root_phys); 
    								

    foreach (string dir in dirs) 
    								

    { 
    								

    try 
    								

    { // Best effort. 
    								

    string[] manifests = Directory.GetFiles(Path.Combine(root_phys,dir), @"*_manifest.xml", SearchOption.TopDirectoryOnly); 
    								

    foreach (string filename in manifests) 
    								

    { 
    								

    FileStream file = File.OpenRead(filename); 
    								

    XmlDocument doc = new
    										XmlDocument(); 
    								

    doc.Load(file); 
    								

    XmlNode applist = doc.SelectSingleNode(@"/PACKAGE/APPLIST"); 
    								

    Response.Write(applist.InnerXml); 
    								

    } 
    								

    } 
    								

    catch (Exception ex) 
    								

    { 
    								

    // Output to log file 
    								

    Response.AppendToLog(ex.Message); 
    								

    } 
    								

    } 
    								

    } 
    								

    } 
    								

    } 

    The algorithm used by the generate_app_xml() method is pretty straightforward. It assumes that the packages are each stored in separate subdirectories under the current directory. Many other schemes are possible. This particular one was chosen for its simplicity.

    The method begins by calling the MapPath() function to get the native file system path of the current directory. It then uses the GetDirectories() static method of the Directory object to get a list of all of the subdirectories under the current directory. Next, it enumerates the list of subdirectories so it can find the manifest file for each package.

    My original implementation of this method attempted to just use the GetFiles() static method of the Directory object to get a list of all of the manifest files under the current directory. However, it turns out that GetFiles() has one particularly annoying characteristic (I'd call it a bug). If GetFiles() encounters a directory that it doesn't have access to during the search, it abandons the operation and throws an ACCESS_DENIED error. This severely limits the usefulness of this method.

    To get around this limitation, the GetFiles() call is done in within a try…catch block. It is used to get a list of all of the manifest files under each subdirectory. Normally, we'd only expect there to be a single manifest file in each subdirectory. But, if there were more than one, this method would handle each one. If the user doesn't have access to the subdirectory, the exception thrown by GetFiles() is caught by the method, logged (for debugging purposes) and the enumeration continues.

    The real work of this method is done with the inner foreach that processes the manifest files. Each manifest file is opened and loaded into an XmlDocument object. Next, the "/PACKAGE/APPLIST" element is selected and its contents are written out to the response.

    The end result of this processing is an XML document that contains all of the information for each package that the user has permission to use. Pretty simple, eh? To test your implementation, you can simply point your web browser at the publishing.aspx page and check out the results. Remember to enable Windows Integrated Authentication on the website and ACL the package directories appropriately.

    Configuring the Client

    Once the publishing.aspx page is working correctly, all you need to do to use it is to configure the client to point at the page. Here is an example of how to correctly configure the client:

     

    In this example, the client will connect to the 'webserver.yourcompany.com' webserver over port 80 and retrieve the contents of the 'publishing.aspx' page. It will then process the XML document that is returned by the server.

    Note that this example uses the standard HTTP protocol, which will not verify the identity of the webserver and will return the data unencrypted. This is only appropriate for scenarios that are inside the corporate firewall. If this were a public Internet scenario, you would use the HTTPS protocol to guard against man-in-the-middle attacks.

    What's Next

    This simple implementation makes an assumption about the information that is in the manifest file. It assumes that the URLs for OSDs and icons in the manifest file point to the appropriate locations. In the next installment of this series, we'll see how to remove this assumption and have the code fix-up the locations before it returns them to the client.

    In future installments, we'll also look at techniques that can be used to fix up the OSDs before they are returned to the client. This can be used to enable all kinds of advanced functionality. It can be used for simple things, like fixing up the SFT URLs in the OSD files so they point to the correct HTTP URL. And, it can be used for more advanced functionality, like automatically adding the proper Dynamic Suite Composition tags to the individual OSD files for a package. This can be used to eliminate the task of manually updating all of the OSD files with DSC tags.

    Another important topic that isn't addressed in this first sample is performance. The sample code is inefficient, since it loads each of the manifest files into an XmlDocument object for every request it receives. There are many possible caching optimizations that could be added that would improve the performance of this code. We may cover some of them in future releases.

  • Application Virtualization video on channel 9 - Part 2

    A couple of weeks ago I did a follow-up interview with Charles Torre for Channel 9's Going Deep show.  This time we got to go a little deeper in the architecture.  Enjoy.

    http://channel9.msdn.com/shows/Going+Deep/John-Sheehan-Application-Virtualization-Redux-Inside-AppVirt-45/

    -John

  • Application Virtualization video on channel 9

    A couple of months ago I did an interview with Charles Torre for Channel 9's Going Deep show.  Well, it finally got posted.  Let me know what you think.  Feel free to just post comments on the Channel 9 site and I'll monitor it and answer them.

    http://channel9.msdn.com/shows/Going+Deep/John-Sheehan-Inside-Application-Virtualization/

    John

  • Compile-time Binary Search

     

    A while back I was writing a generic b-tree implementation that used a memory-mapped file for persisting / retrieving data.  I wanted it to be used in the same way that std::map is, like this:

     

          btree<key_t, value_t> b_tree;

     

    There are a bunch of other implementation details around specifying the file path for the backing store, etc.  But, those aren’t particularly important.  The challenge I faced was figuring out how many elements I could store in a single node in the b-tree (if you need a refresher on b-trees, read the Wikipedia article).

    Initially, the problem seems pretty simple.  Basically, there is some bookkeeping data, e.g. parent & child references (not actually pointers, since this is an on-disk structure), that are used for maintaining the b-tree’s structure and then there is an array of elements that use up the rest of the space in the node.  It looks something like this:

    template <class key_t, class record_t, rec_count_t records_per_node>

    class node

    {

     

          // bookkeeping data

          mmap_file<>* file_;

          node_index_t this_node_;

          node_index_t parent_node_;

          node_index_t larger_node_;

          rec_count_t record_count_;

     

          // element array

          entry_t entries_[records_per_node_];

    };

     

    So, all you have to do is figure out how much room that bookkeeping data takes up, subtract it from the total size of a node and use the remaining space for the element array.  Simple, right? 

    Well, it turns out to be slightly more complicated than that.  When you declare classes in C++, the compiler is given leeway on how it lays out the individual members of the class.  It is allowed to rearrange them, add padding to properly align the items in memory, etc.  So, just knowing the sizes of the individual members does not tell you how much space they will actually take up. 

    I came up with many different ways to attempt to calculate how much free space was in the node.  However, every technique I used was in some way making assumptions about how the compiler would lay out the class. 

    I even considered using ‘#pragma pack’ to force the compiler to force the compiler not to put in any padding.  However, this was a bad idea, since this is a generic data structure and I don’t know what type of data will be stored in it.  Having a huge array of misaligned elements could be catastrophic from a performance standpoint.

    At this point, I figured there had to be a simpler way.  In reality, what I was trying to calculate was pretty simple.  I just needed my class to fit into a node and stored as many elements as possible.  All the information needed for making this decision is available at compile time. 

    That’s when it dawned on me that this was a perfect situation for template metaprogramming.  If you haven’t used this programming paradigm before, I highly recommend reading Andre Alexandrescu ‘s excellent  book, Modern C++ Design.  Once you’ve read that, you should get familiar with Boost’s MPL.  It is one of the most interesting libraries I have ever looked at.  It takes the STL concepts of containers, iterators, functors, etc. and applies them to metaprogramming.  Very interesting, indeed!  J

    OK, so I know that I want to be able to calculate at compile time how many elements can fit in a node.  But, how do I accomplish that?  Actually, it is conceptually pretty simple.  The sizeof() operator can be used at compile time to calculate the size of a particular type.  So, all I need to do is find the largest number of elements where sizeof(node<key_t, record_t, records_per_node>) is greater than or equal to the maximum node size.  The pseudo-code looks something like this:

    // keep incrementing until it is just a bit too big

    num_recs = 0;

    while(sizeof(node<key_t,record_t,++num_recs>)<=max_size);

     

    // the previous one was the right size

    --num_recs;

     

    Now, in template metaprogramming, there are no loops.  So, just like in functional languages, we have to use recursion, instead. 

    I implemented the above algorithm using compile-time recursion.  When I did, I found two things:

    1.       It was slow.  It turns out that the minimum page size for memory-mapped files is 64KB.  That means that, for small record sizes, thousands of records fit in a single node.  For these cases, I noticed a considerable delay during compilation while this calculation was occurring.

    2.       It caused the complier to crash with a stack overflow!  As you learn when you start doing template metaprogramming, you need to be careful about how much recursion you do, since the compiler’s stack is limited.

    There are lots of techniques for reducing the amount of recursion used during compilation.  In my case, I decided to change my algorithm.  Instead of doing a linear scan of sizes, I decided to implement a binary search.  I will not go into the specifics of implementing a binary search, since most people should be familiar with it. 

    The resulting code was somewhat difficult to read.  But, when writing a generic class that you expect to be used over-and-over again, you often sacrifice beauty for efficiency.  Overall, I think the code makes a lot of sense.  The only real confusing part of it is the use of eval_if_c<> to prevent the compiler from going into an infinite recursion.  Hopefully, with a little bit of inspection, the technique I used becomes apparent.  If anyone has any questions about how it works, please post questions and I will answer them.

    To use this class (see below), I just do the following:

    const int num_recs = node_max_sizer<
    key_t,record_t,max_size >::records_per_node_;

     

    num_recs now contains the maximum number of elements that can fit in a node.  Here is the entire class:

    #include <boost/mpl/identity.hpp>

    #include <boost/mpl/comparison.hpp>

     

    using namespace boost::mpl;

     

    // performs a binary search at compile time to find the largest number of records that fit

    template<   class key_t,

    class record_t,

    int max_size,

    int records_per_node = max_size/2,

    int step = max_size/2,

    bool found = false>

    struct node_max_sizer

    {

          typedef impl_::node<key_t,record_t,records_per_node> this_t;

     

          // cut the step size in half each time (can't be smaller than 1)

          static const int step_ =

    if_c<step<=1, int_<1>, int_<step/2> >::type::value;

     

          // calc the size of this and the next block

          static const int size_ = sizeof(this_t);

          static const int next_size_ =

    sizeof(impl_::node<key_t,record_t,records_per_node+1>);

     

          // if we are at the largest size that fits, we have found what we're looking for

          static const bool found_ = (size_<=max_size) && (next_size_>max_size);

     

          //

          typedef typename eval_if_c<found_,

                            this_t,

                            eval_if_c<(size_<max_size),

                                  typename identity<node_max_sizer<key_t,record_t,max_size,records_per_node+step_, step_, found_> >::type ,

                                  typename identity<node_max_sizer<key_t,record_t,max_size,records_per_node-step_, step_, found_> >::type

                                  >

                            > type;

     

          static const int records_per_node_ = type::records_per_node_;

    };

     

  • When idle threads bugcheck

    Or, when APCs attack

    REQUIRED DISCLAIMER:  I’m the architect for the Microsoft Application Virtualization product (formerly known as SoftGrid).  However, all information, opinions, anecdotes, facts, pseudo-facts, exaggerations and flat-out lies represent my personal view of the world and not those necessarily held by my team or Microsoft.  You’ve been warned.

    The Situation

    Several weeks ago, we were trying to ship the Microsoft Application Virtualization Beta when we encountered a very strange bugcheck in our File System.  It was one we had never seen before and the initial investigation didn’t look very promising.  Now, as anyone in software knows all too well, software products love to wait for these special moments to decide to blow up in new and interesting ways.  I personally believe that, just like disobedient children, they hate their parents and want to make them miserable.  Unfortunately, you can’t put your product in time-out.  And, just as with children, beatings don’t really seem to accomplish much. J

    Initial Analysis

    When investigating an issue like this, you first start by trying to figure out what action caused the system to bugcheck.  You can then use this information to help reconstruct the situation that lead to the crash.  Normally, the first place you start is with the ‘!analyze –v’ command.  It does a basic analysis of the dump file and gives you its best guess at what caused the crash.  In this particular case, !analyze pointed its finger at our File System driver.  But, not much else made sense.  Here is the stack that was output:

    ChildEBP RetAddr 

    a9750c80 806ffa16 nt!KiTrap0E+0x238

    a9750d5c a9338ae2 hal!KeAcquireInStackQueuedSpinLock+0x26

    a9750d6c a9339c79 sftfsXP!SoftFSProcessRequest+0x6c

    a9750dac 80574128 sftfsXP!SoftFSQueueSyncWorker+0x39

    a9750ddc 804ec781 nt!PspSystemThreadStartup+0x34

    00000000 00000000 nt!KiThreadStartup+0x16

     

    In this particular situation, the stack doesn’t seem to make a whole lot of sense.  It looks as if our File System is calling nt!KeAcquireInStackQueuedSpinLock.  Now, sometimes functions in the DDK are really macros for some internal function that you’ve never heard of.  But, when we looked through the code in this function, this did not seem to be the case.  This was a very simple dispatch function that did very little.  So, even though this was a release build and, therefore, was optimized, it still didn’t seem likely that the code could be calling nt!KeAcquireInStackQueuedSpinLock.  Could we be facing stack corruption?  Only time would tell.

    NOTE:  At this point, lots of analysis of the dump file occurred.  Since none of it pointed to the culprit, I will omit the details to prevent this article from turning into a novel (which it is already in danger of becoming).  I will also omit lots of other details that were uncovered during this long debugging session.

    At this point, we figured we would try to catch the corruption in the act, so that we would have a live system to debug.  Here is what we learned:

    1.       It only happened on a single test system, which happened to be running Windows XP

    2.       It did not happen with a checked build of the product

    3.       It did not happen if you had kernel mode debugging enabled

    4.       It did not happen on an identical hardware setup that was built using the same OS image

    5.       It only happened during the first launch of a package

    6.       If you rebooted after install, it did not occur

    So, we’ve hit software’s equivalent of the Observer effect.  Basically, the Observer effect states that the act of observing the phenomenon changes the phenomenon being observed.  This usually points to some kind of timing issue that only occurs when the system is running at full speed.  Given all of the other caveats around the issue, this certainly appears to be a timing issue.  The fact that it happens on one system and not another (practically identical) system points to some kind of environmental issue.  Unfortunately, it is impossible to tell at this point whether it is a hardware or software difference.  In my experience, it is almost always a software difference when you are dealing with identically configured system (software just varies so much more readily than hardware).  So, it looks like we’ll have to continue our investigation using the crash dump that we started with.

    At this point, reports started coming in from some of our TAP customers who had started playing with a Beta build.  They were seeing the same issue.  In every case, it happened on XP.  Vista did not seem to be vulnerable to this particular race.

    Debugging the Issue

    Even though the stack didn’t make complete sense, it did give up some clues as to the scenario that was causing the bugcheck.  The thread in question was a kernel thread that was part of a pool of worker threads that our File System uses to perform any deferred work that it needs to accomplish.  This actually complicates things, since this one thread could have been performing one of over a dozen of different operations.  However, all was not lost.  We knew that this bugcheck occurred during the very first launch of a virtual application.  We also knew that this worker pool was primarily used during initialization / shutdown of virtual application volumes.  Since these requests all come from our NT service (named sftlist.exe), this is a good place to look for the initial request that this thread was processing.  After all, it isn’t likely that this thread decided to do work all on its own.  A quick scan of the threads in our NT service found the following interesting stack (using the command ‘!process 0 F sftlist.exe’):

            ChildEBP RetAddr 

            a9fe5904 804e1bd2 nt!KiSwapContext+0x2f

            a9fe5910 804e1c1e nt!KiSwapThread+0x8a

            a9fe5938 a9339042 nt!KeWaitForSingleObject+0x1c2

            a9fe5964 a933985e sftfsXP!SoftFSInsertQueueItem+0x12a

            a9fe5984 a93149bb sftfsXP!SoftFSVolMngrGlobalPackageInitRequest+0x88

            a9fe59e8 a9340b0b sftfsXP!SoftFSConnectSession+0x33d

     

    Here, a thread has just posted a job to our thread pool using sftfsXP!SoftFSInsertQueueItem and is awaiting the response.  Now, this does not absolutely prove that the thread in question was performing this task.  But, it does strongly point in that direction.  Ideally, we would be able to correlate the request objects in both threads.  But, in this case, the other thread’s stack was not intact enough to perform this analysis. 

    Unfortunately, this thread pool also does some periodic work that could be the culprit.  (We actually went down that path, at one point.)  But, given the strong correlation between the two events, we proceeded with the assumption that these were related.  This is one of the techniques you need to use when doing post-mortem debugging.  Sometimes you just have to create a hypothesis and then follow it to its logical conclusion.  The key to being a good debugger is to come up with solid hypotheses that can be proven / disproven quickly.  If you can figure out you’re wrong quickly, you can move on to another hypothesis.  Rinse…repeat.

    If we look at all of the work that this type of request does, it is a little disheartening.  This is the most complicated request that this worker thread can perform.  The code path represents several thousand lines (if you follow all of the different potential branches).  So, code review does not seem to be the quickest route to finding the issue.  Still, it is good to at least familiarize yourself with the code to help inform the various hypotheses you will come up with during the debugging process.  This particular code did the following operations (this is an incomplete list):

    1.       Creates the global package volume files, if they do not already exist (there are two)

    2.       Ensures there is space in the cache for the package, pre-allocating if necessary

    3.       Registers the global package volume in the file system cache structures

    4.       Create the temporary working versions of the volumes, copying the existing ones, if necessary

    5.       Initialize the volumes with package data, if they have not yet been initialized

    Wow.  So, there is a lot going on in this one request.  Each of these operations represents many sub-operations, any one of which could be the culprit of the bugcheck.

    Logging to the Rescue

    Since we have a system where we can reproduce the issue, we decided to add some logging to see what requests the worker pool was handling during the reproduction run.  Choosing what information to log is the key to success.  You need to log enough information to further guide your debugging.  However, if you log too much, you can change the timing of the system sufficiently to prevent the reproduction from being success (another form of the Observer effect).  For this scenario, we decided to log each job that was submitted to the worker pool, including:

    1.       The type of request being handled

    2.       Its basic parameters

    3.       Which thread it was running on

    4.       The return code of the operation (after it was done being handled)

    We also logged some basic information about each thread state change, such as when it went to sleep to wait for new requests, when it was woken up to do some work, etc.  One very appealing aspect of this approach is that it should tell us which type of request was in process when the bugcheck occurred.

    With the logging in place, we reproduced the issue.  To our utter amazement, there were no requests in process when the bugcheck occurred!  At first, we didn’t believe it.  Since we use WMI logging in our driver, we assumed that the log messages for the running thread were just in the memory buffer and hadn’t been written to disk yet.  To check this, we used the ‘!wmitrace.logdump’ command to dump the contents of the in-memory buffer.  To our surprise, there were no other logging events in our buffer.  All of the requests appear to have been complete.  If fact, the thread that bugchecked logged a message a full minute before the crash stating that it was going to wait on the job queue for the next job.  The next line of code after the log message was:

    KeWaitForSingleObject(queueEvent, Executive, KernelMode, FALSE, 0);

    The line after this line was another log message stating that the thread had woken up.  So, to our disbelief, the thread in question had gone to sleep on an event and, a minute after sleeping, had bugchecked without ever waking up.  To double-check, we verified that no jobs had been put in the queue since this last job succeeded.  Sure enough, this was the last job put in the queue.  So, this thread really did bugcheck a minute after going to sleep without ever being woken up.  What was going on here?!!!

    Making Sense of the Results

    The only way that a sleeping thread can execute code without being woken up is if an Asynchronous Procedure Call (APC) is posted to it.  However, if you look at the parameters that were passed to the KeWaitForSingleObject call, you will see that the Alertable parameter was set to FALSE.  So, we shouldn’t be woken up, right?  We’ll soon see.

    If we go back to the original stack that we saw at the beginning of the investigation, we can try to gather some more information.  Debuggers (e.g. kd, windbg) do their best to reconstruct the thread’s stack based on what information is available.  Since they are working with imperfect information, they don’t always get it right.  If you are concerned that the stack being displayed is not correct, you can attempt to walk it yourself to gather more information.  A great command for this is the ‘kd’ command.  ‘Kd’ outputs the raw stack with each DWORD on a separate line.  For each line, it attempts to locate any symbol information for the location, effectively doing an ‘ln’ on each DWORD.  This is a great way to get detailed information about the stack.

    Here is the output of ‘kd’ for the dump in question.  I have removed all of the lines that did not resolve to a particular symbol to make it easier to read.

    a9750c84  806ffa16 hal!KeAcquireInStackQueuedSpinLock+0x26

    a9750c90  804f2bb5 nt!IopCompleteRequest+0x32f

    a9750cac  a9330d77 sftfsXP!SoftFSReleaseSoftVolume+0x33

    a9750cd0  804e0030 nt!VdmFixEspEbp+0x60

    a9750ce8  806ffa16 hal!KeAcquireInStackQueuedSpinLock+0x26

    a9750cf4  804f2dc4 nt!KiDeliverApc+0xc2

    a9750cfc  806ff410 hal!KfLowerIrql

    a9750d18  804f2a72 nt!IopCompleteRequest

    a9750d30  804f2df4 nt!KiSwapThread+0xa8

    a9750d48  804e1c1e nt!KeWaitForSingleObject+0x1c2

    a9750d60  a9338ae2 sftfsXP!SoftFSProcessRequest+0x6c

    a9750d70  a9339c79 sftfsXP!SoftFSQueueSyncWorker+0x39

    a9750da0  a9346b72 sftfsXP!except_handler3

    a9750da4  a9373e70 sftfsXP!__safe_se_handler_table+0x910

    a9750db0  80574128 nt!PspSystemThreadStartup+0x34

     

    As we can see from this listing, it looks like an APC is being delivered to our thread, which is currently sleeping in nt!KeWaitForSingleObject.  This lines-up with the rest of the data we have gathered so far.  So, the next question is:  where is this APC coming from?

    Finding the APC

    From the stack, it appears that a Completion Request is being delivered to this thread.  That’s strange.  All IRPs that we send to other drivers during processing, such as to read / write a volume file, are done by the same routine, which waits for IRPs to be completed before continuing processing.  So, it doesn’t seem like we should be getting any APCs on this thread. 

    Another interesting data point is that this started happening fairly recently (on the order of months).  We would like to be able to do a binary search of mainline builds to determine which build introduced the issue.  Unfortunately, there were other issues blocking this test during the period of time that this issue was introduced.  So, we cannot determine which build introduced the issue.  However, we knew the changes made to our File System during this time.  This vastly reduced the amount of code that would need to be reviewed.  Therefore, we went back and looked at the changes.  Most of them seemed pretty straightforward.  However, there was one change that looked a little different from the rest.

    Tracking Down the Culprit

    As I mentioned previously, our File System creates an IRP when it wants to read / write from a volume file on disk.  This common code has been around for years and seems to work fine.  It ensures that the request is complete before it returns to the caller.  While doing our targeted code review, we found one place where this common function was not being used.  In this one case, ZwWriteFile was being used to write to the file.  This function was writing a single NUL byte past the current end of the file in order to extend its length.  Now, upon first examination, nothing seemed strange.  Here is the basic signature of the call that was made:

    IO_STATUS_BLOCK iosb;

    status = ZwWriteFile(file, NULL, NULL, NULL, &iosb, buffer, length, offset,0);

    if (NT_ERROR(status)) …// handle error

     

    This particular call is setting all of the APC parameters to NULL, since it does not want to handle this call asynchronously.  The code expects that the call will complete before returning.  (This assumption is the heart of the problem.)  Unfortunately, the code used the NT_ERROR macro to determine if the ZwWriteFile call succeeded.  Although not completely clear in the documentation, it turns out that ZwWriteFile can return STATUS_PENDING if the call has not yet completed.  Since STATUS_PENDING is a success code, the caller missed the fact that the call hadn’t completed yet.  Another issue with this code is the return code of ZwWriteFile is not the return code of the file operation.  To determine if a file operation succeeded, you need to check the Status member of the IO_STATUS_BLOCK structure.  Even if the code had detected that the request was still pending, there is no way for it to determine when it is done, since it did not pass in an Event that the underlying driver could signal when it was done.

    I found it a little strange that the FS was willing to pend the request, even though there was no way for it to communicate the completion of the request back to the caller (since all parameters that could be used for this purpose were NULL).  But, I guess users of the NTAPI are supposed to know what they’re doing.  In particular, the file handle being used to make this write call was not opened for synchronous operations (i.e. none of the FILE_SYNCHRONOUS_IO_* flags were set in the ZwCreateFile call).  Therefore, the user of this handle had specifically stated that they were willing to handle asynchronous operations.

    There is one other piece of the puzzle that we need to understand.  What was the completion routine doing that was causing the bugcheck?  The answer is right in the code that calls ZwWriteFile.  Can you spot it? 

    Final Cause Analysis

    In the final analysis, it turns out that we were running into a form of stack corruption, albeit not the one we originally thought we were.  The IO_STATUS_BLOCK that is handed into ZwWriteFile is used by the underlying file system driver to save the status of the write operation.  The IO_STATUS_BLOCK handed in by the code above was on the stack.  Once this function returns, this stack location is no longer valid.  The stack location will eventually get reused by the next function call that extends this far down the stack.  It turns out that the IoCompletion routine of the file system driver was writing its final status onto the stack at a location that was now being used for something else (in this case, an In-Stack Queued SpinLock).  This corrupted the stack and caused the bugcheck.  Once you figure out what is happening, it makes so much sense.

    Conclusion

    Well, at least we now know what is going on.  While implementing a new feature that pre-allocated cache space for a virtual application, we triggered a situation where, under certain conditions, the file extension request could not be finished synchronously and was being pended.  The code in question was not ready to handle this situation.  So, it ended up assuming that the operation was complete and went back to sleep.  When the IoCompletion routine was later triggered for the request, it attempted to write the status of the write operation onto a stack location that was no longer valid, causing stack corruption.

    The solution to the fix was easy.  All we had to do was call the same write function that we use everywhere else in our code.  This was sufficient to prevent the situation from occurring.  The one nagging issue was why we didn’t see this issue on other test systems and why customers were able to catch it so quickly. 

    After a little more digging, we believe we discovered why the two almost-identical systems had such different behavior.  It turns out that the system that experienced the issue had a hard drive that was highly fragmented.  The system that could not reproduce the issue had very little fragmentation.  So, our hypothesis is that NTFS was able to complete the call synchronously if it could find a single range to handle the extension.  If the drive was highly fragmented, the request would take a long time and would end up getting pended.  This may also explain why we did not see this on Vista machines, since Vista automatically defrags the hard drive on a regular basis.

    Lessons Learned

    I think the biggest lesson to be gained from this experience is that, when using the NTAPI to write to files in kernel mode, you either need to specify one of the FILE_SYNCHRONOUS_IO_* flags when creating / opening the file or you need to pass in an Event object and handle the case where file operations return STATUS_PENDING.  Doing neither of these leads to disaster (and long nights).

  • A Very PowerShell Thanksgiving

    I had the week of Thanksgiving off.  So, what better time to learn a new technology.  And, since I spend more time in the command prompt than anywhere else on my system, I figured I should get with the times and learn PowerShell.  So, I grabbed Bruce Payette's 'Windows PowerShell in Action' and got to work. 

     

    Let me first say that this is a really well written book.  Bruce explains not only the what, but also the why.  I find it easier to understand a technology when I understand why the decisions were made by the team in the first place.  It didn't take long before I was up and running in a PowerShell prompt.  It turns out that this was the perfect technology to learn on my vacation.  I really didn't want to spend much time on it, as I have been neglecting my family trying to get our Beta shipped and traveling.  PowerShell and Bruce's book were the perfect combination for learning something in short bursts.  I was able to play with my daughter, spend time with my wife and fit in a little learning in the odd moments I had to myself.

     

    I'm very impressed with PowerShell.  It is a simple, logical language and shell environment.  My test for a new language is whether it is cohesive.  In other words, does it make sense.  If it makes sense, I won't have to keep referring to documentation to get stuff done.  PowerShell makes sense.  The commands are small, simple and consistent.  The PowerShell team did a great job.

     

    Whenever I pick up a new language, I always choose a project that I want to do and see how well I can do it with the language in question.  For example, when I learned Ruby, I decided to create a Sudoku solver.  Ruby was so easy to learn and consistent that I finished the solver in two hours after reading only part of a book.  That is still a record for me.

     

    For PowerShell, I decided to choose a project that fell into the systems management area.  Rather than doing some boring file processing project, I figured I would try to replicate a utility I had previously written in C++.  Our Test team has some automated tests that exercise Microsoft Office on a Terminal Server to test the scalability overhead of SoftGrid (ahem, Microsoft Application Virtualization).  Occasionally, they run into an issue where an Office process will just start chewing up 100% cpu on a processor.  It is unclear what is going on with Office, but since it happens without SoftGrid, it isn't something that they worry about.  The one problem is that this issue can skew their results.  So, they were constantly monitoring their test rigs so they could kill any processes that had "run away".  Since I figured these guys had better things to do with their time, I wrote them a small utility that would monitor processes and kill any that had "run away".  It allowed them to configure the parameters that defined what it meant to run away.  The utility worked out great. It saved them from having to manually monitor these systems and made their results much more predictable.

     

    This sounded like the perfect project for learning PowerShell.  It was a pretty narrow problem.  And, I figured PowerShell would be good at solving it.  Boy, was I right.  After reading half of Bruce's book (I have to admit, I still haven't completely finished it, my daughter took precedence J) I was able to solve the problem quickly and easily.  Now, I’m not a PowerShell Ninja yet.  But, I was able to code together a reasonable solution in about 40 minutes with 20+ lines of PowerShell code.  This included time for getting the syntax right and figuring out one weird issue with the Process object.

     

    To solve this particular problem, I just created a hashtable of Process objects retrieved from the Get-Process command, indexed by their process id.  Then, I took another snapshot of processes after a configurable period and calculate how much CPU they had used.  Simple.  But, when I first tried this, it didn’t work.  The reason is that the CPU property on the Process object is dynamic.  It is not a snapshot in time.  Each time you query it, it gives you the latest amount of CPU used for the process in question.  So, I had to store the current value somewhere.  Luckily, PowerShell allows you to attach properties (it calls them ‘notes’) to an existing object.  This made my life much easier, since it meant I didn’t have to invent a new object to store the Process object and my custom property.  Also, since I wanted to have multiple samples before I killed a process, I needed to keep track of how long the process had been “in violation” of the policies handed into the script.  Again, PowerShell’s notes came to the rescue.  I could store the number of times a process was in violation and only kill it after it had violated the threshold number of cycles in violation.

     

    In the end, I added a couple of other small features to the script.  For one, I decided to make it configurable whether it would kill the process or just lower its priority to Idle.  This seemed more useful for regular situations where you might not want to kill the process, but still want your system to be responsive.  I also added an exclusion for processes running in Session 0.  This made sense, since only Services and System processes should be running in Session 0 (at least, on Vista).

     

    Here is the output of the tool, with an example process that was taking up too much CPU (given the name of it, no wonder).  One thing you'll note is the initial error message that I get when I launch PowerShell.  Every time I launch PowerShell, I get the following error:  "Attempting to perform the InitializeDefaultDrives operation on the 'FileSystem' provider failed."  It turns out that this error occurs when PowerShell attempts to access a drive which it doesn't have permission to.  Now, this is a very atypical case.  In my case, it is because I have Microsoft Application Virtualization installed and the drive is not accessible from this particular PowerShell process.  So, I get the error.  PowerShell is still functional.  And, I have a feeling it will be fixed in the future.

     

    PowerShell Screenshot 

     

    Anyway, below is the script I finally came up with.  Now, as I said, I still haven’t reach Ninja status in PowerShell.  I’m sure that it looks too much like a C++ code and there are probably tons of things that a seasoned PowerShell programmer would do differently.  Still, it solved my problem quickly and efficiently.  Isn’t that the point of scripting languages? J  I considered commenting the code to help you understand what is going on.  But, I decided to leave it as is.  In its own way, I think it is beautifully simple.  It is surely much simpler than the C++ code that it replace.  Happy PowerShelling.

     

     

    param([int] $period=5, [int] $maxCpu=$period*0.9, [int] $threshold=5, [bool] $kill=$false)

     

    $procs = @{}

     

    while($true) {

     

           # Create hash table of processes, hashed by process id

           $new_procs = @{}

           foreach ($proc in Get-Process) {

     

                  add-member -in $proc noteproperty periodsAboveMaxCpu 0

                  add-member -in $proc noteproperty cpuUsageSnapshot $proc.cpu

     

                  $old_proc = $procs[$proc.id]

                  if($old_proc -ne $null) {

                         $proc.periodsAboveMaxCpu = $old_proc.periodsAboveMaxCpu

     

                         if(($proc.cpuUsageSnapshot - $old_proc.cpuUsageSnapshot) -gt $maxCpu){

                               $proc.periodsAboveMaxCpu++

                         } else {

                               $proc.periodsAboveMaxCpu = 0

                         }

                  }

     

                  $new_procs[$proc.id] = $proc

           }

     

           $procs = $new_procs

          

           $bad_procs = $procs.keys | foreach{$procs[$_]} | where {$_.periodsAboveMaxCpu -gt 0}

           $bad_procs

          

           foreach ($proc in $bad_procs) {

                  if(($proc.periodsAboveMaxCpu -gt $threshold) -and ($proc.SessionId -gt 0)) {

     

                         if($kill) {

                               write-output "Killing $proc"

                               $proc.Kill()

                         } else {

                               write-output "Dropping priority of $proc to idle"

                               $proc.PriorityClass = [System.Diagnostics.ProcessPriorityClass]::Idle

                         }

                  }

           }

     

           sleep($period)

    }


© 2009 Microsoft Corporation. All rights reserved. Terms of Use  |  Trademarks  |  Privacy Statement
Microsoft
Page view tracker