Colin Thomsen's Microsoft Blog

I'm a developer working on the code profiler that ships with Visual Studio 2010 Premium and Ultimate editions. At a previous company I worked on computer vision software for face and gaze tracking.

Posts
  • Tip: VS2008 - Understanding Performance Targets

    default_wizard_output_slnexplorer

    If you have a solution that contains multiple projects it is important to know what the 'Targets' group in the Performance Explorer is used for. The PeopleTrax solution shown on the right has 4 projects, with 3 of them compiling to managed DLLs and 1 compiling to an executable.

    After running the Performance Wizard to create a Performance Session the Performance Explorer contains a single target as shown below.

    default_wizard_output_perfexplorer 

    Only the project that compiles to an executable is listed in the 'Targets' folder (for other project types like websites it would include the default launch project). What about the other 3 projects? As this tip explains, it depends upon the type of profiling you wish to do.

    Sampling

    With sampling there is no need to add the additional projects to your targets list. We do not modify assemblies when sampling and we will automatically attempt to collect data for any assemblies loaded by the PeopleTrax target. The only exception is if you wish to collect data for multi-process scenarios and therefore need to launch multiple targets.

    Instrumentation

    For instrumentation, if you wish to collect data for the additional projects they should be added to your targets list as follows:

    1. In the Performance Explorer, right-click on the 'Targets' folder:
      add_target_project_rightclick
    2. Choose 'Add Target Project' to display a dialog:
      add_target_project_dialog 
    3. Select the assemblies you wish to collect Instrumentation data for and choose OK.

    The selected projects will now be modified (instrumented) when you start profiling. You can selectively disable instrumentation for certain projects by right-clicking on the target and unchecking the 'Instrument' option.

    targets_launchable_trace_properties_crop
    Instrumentation properties for a specific target.

  • Quick Tip: VS2008 - Compare Reports Quickly

    While investigating a performance problem you may need to collect many Performance Reports and compare them. You can use the Performance Explorer to quickly compare two reports by:

    1. Selecting two reports.
    2. Right-clicking and choosing 'Compare Performance Reports...'
      comp_reports

    The oldest report will be used for the 'Baseline' report and the other report will be used for the 'Comparison' report, as shown below:

    comp_reports_2

  • Sysinternals is Live

    I use a bunch of Sysinternals tools for diagnosing problems while developing. My two favorites are:

    • Process Explorer, a more fully-featured version of Task Manager that can report environment variables for running processes, show loaded DLLs and even display callstacks. It can also tell you which process is currently accessing a certain file or DLL, which is useful if you're trying to delete a file and getting a 'file is in use and cannot be deleted' error.
    • Process Monitor, which can record all accesses to files, disks and the registry. Very useful for diagnosing complicated scenarios with multi-process development.

    Recently the Sysinternals tools have been hosted on a new live site that can be accessed via the web, or as a file share. Now I can easily run a Sysinternals tool and be sure that it is the newest version:

    dbgview 

    I can also update my own local cache of useful tools by periodically copying from the file share.

  • Tip: VS2008 – Finding and Setting Properties (Right-Click)

    The Visual Studio Profiler has many properties and options and this tip shows you where to find most of them. Future posts may cover some of the specific properties in more detail.

    Performance Session:
    session_properties 
    Select an existing Performance Session in the Performance Explorer to see properties in the Properties Window. If the Properties Window is hidden, press ‘F4’ or go to ‘View->Properties Window’.

    Performance Report:
    report_properties

    Select a Performance Report in the Performance Explorer to view many properties including Collection, ETW, General, Machine Information, Performance Counters, Process, Thread and Version Information.

     

    Performance Session Properties (and Options):

    session_properties_1
    To adjust Performance Session properties:
    1. Right-click on the Performance Session (Performance1 in this example).
    2. Select ‘Properties’.

    Properties for Performance1 are shown below. There are different categories of properties on the left (e.g. General, Launch, Sampling, …).

    session_properties_2

     

    Performance Targets:

    target_properties_1
    To adjust Performance Target properties:
    1. Right-click on the Target (ConsoleApplication3 in this example).
    2. Select ‘Properties’.

    Adjust the properties for the Performance Target as required. These properties do not often need to be changed, with the possible exception of the Instrumentation property ‘Exclude small functions from instrumentation’.

    target_properties_2

     

    Tools –> Options –> Performance Tools:

    Some global options can be configured using the Visual Studio Options dialog, which is accessed via:

    Tools –> Options –> Performance Tools

    tools_options

    That’s all the properties I can think of, although I’m probably still missing some. The most important point of this tip is that right-clicking is often the way to access important contextual information.

  • C# For C++ Devs: ~A() doesn't act like a destructor

    In C++, memory allocated with the 'new' keyword must be deallocated using 'delete' or it is not deallocated until the application finishes. A call to delete results in a call to the destructor for that class. Objects that are allocated on the stack are automatically destroyed, which calls their destructor, when they go out of scope.

    Sometimes this 'deterministic' memory allocation/deallocation behavior is exploited by developers using scoped objects on the stack to acquire and then automatically release resources even in the presence of exceptions (this pattern is known as Resource Acquisition Is Initialization - RAII).

    Here is a C++ class designed to be used in the RAII pattern:

    class A
    {
    public:
       A()
       {
          // Acquire a resource (e.g. mutex or file)
       }

       ~A()
       {
          // Release the resource
       }
    };

    The class is then used as follows:

    void f()
    {
       {
          A raii;
          // do some stuff, maybe even throw exceptions
       }
       // raii has gone out of scope, so the destructor has been called. If an
       // exception was thrown, raii still went out of scope and the destructor
       // was still called.
    }
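    To make the exception case concrete, here is a minimal sketch that records the order in which things happen. The g_events log, the shouldThrow parameter and the CheckRaiiOrder helper are hypothetical names added purely for illustration; they are not part of the original class.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical event log so the order of calls can be inspected.
static std::vector<std::string> g_events;

class A
{
public:
   A()  { g_events.push_back("acquire"); }  // stand-in for taking a mutex or opening a file
   ~A() { g_events.push_back("release"); }  // runs on normal exit and during stack unwinding
};

// Returns true if an exception escaped the scoped block.
bool f(bool shouldThrow)
{
   try
   {
      A raii;
      if (shouldThrow)
      {
         throw std::runtime_error("something went wrong");
      }
      g_events.push_back("work done");
   }
   catch (const std::runtime_error&)
   {
      // By the time we get here, ~A() has already run.
      g_events.push_back("caught");
      return true;
   }
   return false;
}

// Usage sketch: call this from main() to check both paths.
void CheckRaiiOrder()
{
   g_events.clear();
   assert(f(true));
   assert((g_events == std::vector<std::string>{"acquire", "release", "caught"}));

   g_events.clear();
   assert(!f(false));
   assert((g_events == std::vector<std::string>{"acquire", "work done", "release"}));
}
```

    In both paths "release" is recorded without any explicit cleanup call, which is the property the RAII pattern relies on.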

    C# is a language with automatic garbage collection, which means that developers allocate memory but in most cases don't need to worry about when that memory is deallocated. There is no way to explicitly call the destructor. It is called whenever the garbage collector decides it is necessary to clean up, a process known as finalizing the object. In most cases classes should not implement a destructor.

    In C#, it is possible to get somewhat deterministic cleanup (at least for unmanaged resources like files) by implementing the IDisposable interface and adding a Dispose() method. That method acts much more like a C++ destructor than the C# destructor does. The dispose pattern is described pretty well for C# in the MSDN help for IDisposable.

    Things to note:

    • The C# destructor will only (and can only) be called by the Finalizer.
    • Dispose() may be called in code.
    • If Dispose() is called before the Finalizer is called, finalization is suppressed using GC.SuppressFinalize(this);.
    • You must be careful not to reference any managed objects if Dispose is called from the destructor (this is achieved in the example by using an extra Dispose() function that takes a bool parameter).
    • It isn't covered in the code, but if you have member variables that implement IDisposable, your class should also implement IDisposable.

    Working with unmanaged resources is clearly much more work than working with managed resources.

    To implement the same RAII pattern from above in C#, assuming you have set up your class A to implement IDisposable, use the 'using' statement to ensure Dispose() is called at the end of the block as follows:

    using (A raii = new A())
    {
       // Do some stuff...
    }

    This is safe in the presence of exceptions in the same way that the C++ scoped class pattern was above.

  • VS2010: Attaching the Profiler to a Managed Application

    Before Visual Studio 2010, in order to attach the profiler to a managed application, certain environment variables had to be set using vsperfclrenv.cmd. An example profiling session might look like this:

    • vsperfclrenv /sampleon
    • [Start managed application from the same command window]
    • vsperfcmd /start:sample /output:myapp.vsp /attach:[pid]
    • [Close application]

    If the environment variables were not correctly set, when attempting to attach you would see this message:
    old_attach_warning

    The profiling environment for ConsoleApplication2 is not set up correctly. Use vsperfclrenv.cmd to setup environment variables. Continue anyway?

    The generated report would typically look something like the report below. The warning at the bottom of the page indicates the problem and the report itself would typically not be useful since no managed modules or functions would be resolved correctly.

    old_attach_badreport
    Report with ‘CLRStubOrUnknownAddress’ and ‘Unknown Frame(s)’, and the warning ‘It appears that the file was collected without properly setting the environment variables with VSPerfCLREnv.cmd. Symbols for managed binaries may not resolve’.

    Fortunately the Common Language Runtime (CLR) team provided us with a new capability to attach to an already running managed application without setting any environment variables. For more detailed information take a look at David Broman’s post.

    Caveats:

    • We only support attach without environment variables for basic sampling. It will not work for Allocation or Object Lifetime data collection and Instrumentation attach is not possible. Concurrency (resource contention) attach is supported.
    • The new attach mechanism only works for CLR V4-based runtimes.
    • The new attach mechanism will work if your application has multiple runtimes (i.e. V2 and V4 SxS), but as noted above, you can only attach to the V4 runtime. I’ll write another post about the profiler and Side by Side (SxS).
    • The old environment-variable-based attach still works, so you can still use that if you prefer.

    The new procedure for attaching the profiler to a managed application in Visual Studio 2010 goes like this:

    • Launch your app (if it isn’t already running)
    • Attach to it, either from the command-line or from the UI.
    • When you’re finished, detach or close the app to generate a report.

    new_attach_report

    If you want to diagnose any issues with attach, the CLR V4 runtime provides diagnostic information via the Event Log (view with Event Viewer) and the profiler also displays information there:

    new_attach_eventlog

    Event Log: ‘Loading profiler. Running CLR: v4.0.21202. Using ‘Profile First’ strategy’

    There are two .NET Runtime messages regarding the attach, the first indicating that an attach was requested and the second that the attach succeeded. The VSPERF message describes which CLR is being profiled.

  • VS2010: Just My Code

    The ‘Just My Code’ feature in the profiler has a few differences to the ‘Just My Code’ feature in the debugger so this post should provide a useful introduction.

    Example Program

    Here’s a very simple program I’ll use in this post.

    using System;
    
    namespace ConsoleApplication1
    {
        class Program
        {
            static void Main(string[] args)
            {
                Foo();
            }
    
            private static void Foo()
            {
                double d = 0;
                for (int i = 0; i < 100000000; ++i)
                {
                    d += Math.Sqrt(i);
                }
                Console.WriteLine(d);
            }
        }
    }

     

    Why ‘Just My Code’?

    Typically when profiling you are most interested in optimizing code that you either wrote or you have control over. Sure, sometimes there will be issues in the frameworks that you are using or in other binaries, but even then you often control the calls into those frameworks.  Just My Code or JMC is intended to filter the data that is displayed in profiler reports so that more of the code you control shows up in the reports and the reports are more manageable.

    For example, the Call Tree after collecting sampling data for the simple program above, with JMC off, is shown below:
    jmc_off_calltree_1

    With the default JMC options, this reduces down to:
    jmc_on_calltree_1

    What is ‘My Code’?

    There are two conditions for code being considered ‘My Code’ by the profiler and they are both at the Module level (Module Name column in the screenshots above). In the example above, this means the checks are made against the clr.dll, mscoree.dll, mscoreei.dll and ConsoleApplication1.exe binaries.

    Modules considered ‘My Code’:

    1. the copyright string for the module does not contain ‘Microsoft’, OR:
    2. the module name is the same as the module name generated by building any project in the currently open Solution in Visual Studio.
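    The two conditions can be written as a small predicate. This is only a sketch of the rule as described in this post, not the profiler's actual implementation: the ModuleInfo struct, the IsMyCode function and the solutionOutputs parameter are hypothetical names, and the case-sensitive string comparisons are an assumption.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical description of a loaded module.
struct ModuleInfo
{
   std::string name;       // e.g. "ConsoleApplication1.exe"
   std::string copyright;  // copyright string read from the module
};

// Returns true if the module would count as 'My Code' under the two
// conditions listed above. solutionOutputs holds the module names produced
// by building the projects in the open solution.
bool IsMyCode(const ModuleInfo& module, const std::set<std::string>& solutionOutputs)
{
   assert(!module.name.empty());

   // Condition 1: the copyright string does not contain 'Microsoft'.
   const bool notMicrosoft = module.copyright.find("Microsoft") == std::string::npos;

   // Condition 2: the module is built by a project in the current solution.
   const bool builtBySolution = solutionOutputs.count(module.name) > 0;

   return notMicrosoft || builtBySolution;
}
```

    Because the conditions are combined with OR, a module built from your own solution still counts as 'My Code' even if its copyright string happens to mention Microsoft.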

    How do I turn JMC on or off?

    You can temporarily toggle JMC on or off on the profiler Summary Page in the Notifications area using ‘Show All Code’ or ‘Hide All Code’ (shown in red below):
    jmc_on1

    The default setting may be configured as discussed in the following section.

    How do I configure JMC?

    Use Tools –> Options –> Performance Tools –> General and set options in the ‘Just My Code’ section:
    jmc_options

    The default has JMC on, showing one level of non-user callee functions. In the example above with JMC on, this is why we see the call to COMDouble::Sqrt(double) showing up in the call tree.

    It is also possible to show one level of non-user code calling user code, which in the example above would add one level of the non-user code that calls Main, as shown below:
    jmc_on_calltree_callers_1

    Why is ‘Just My Code’ only available for sampling?

    When you instrument binaries for profiling, you have already performed some level of JMC. Only binaries that you instrument and first-level calls into other binaries will show up in the instrumentation report, so JMC is not really necessary.

  • VS2010: Profiler Guidance (rules) Part 1

    The new guidance feature in the VS2010 profiler will look familiar to people who have used the static code analysis tools in previous versions. However, instead of statically analyzing your code, the profiler runs it and analyzes the results to provide guidance to fix some common performance issues.

    Probably the best way to introduce this feature is via an example. Let’s assume you have written a simple application as follows:

    using System;
    namespace TestApplication
    {
        class Program
        {
            static void Main(string[] args)
            {
                BadStringConcat();
            }

            private static void BadStringConcat()
            {
                string s = "Base ";
                for (int i = 0; i < 10000; ++i)
                {
                    s += i.ToString();
                }
            }
        }
    }

    If you profile this application in Sampling Mode you’ll see an Action Link called ‘View Guidance’ on the Summary Page:

    Action Links View Guidance

    Clicking on this link brings up the Error List, which is where you would also see things like compiler errors and static code analysis warnings:

    Error List String.Concat
    DA0001: System.String.Concat(.*) = 96.00; Consider using StringBuilder for string concatenations.

    As you can see there is 1 warning, DA0001, which warns about excessive usage of String.Concat. The number 96.00 is the percentage of inclusive samples in this function.

    Double-clicking on the warning in the Error List switches to the Function Details View. Navigating up one level of callers, we see that BadStringConcat is calling Concat (96% of Inclusive Samples) and doing some work itself (4%). The String.Concat call is not a direct call, but looking at the Function Code View you can see that a ‘+=’ operation on a string triggers it.
     Function Details Concat

     

    The DA0001 rule suggests fixing the problem by changing the string concatenation to use a StringBuilder but I’ll leave that up to the reader. Instead, I’ll cover some of the other aspects of rules.

    One of the more important questions is what to do if you wish to turn off a given rule (or even all rules)? The answer is to open up the ‘Tools/Options’ dialog and in the Performance section, navigate to the new ‘Rules’ subsection:

    rules_options

    Here you can see that I’ve started searching by typing ‘st’ in the search box at the top. This dialog can be used to turn off rules (by clicking on the checkboxes on the left), or to change rule categories to ‘Warning’, ‘Information’ or ‘Error’. The only effect is to change how the rule is displayed in the Error List.

    If you have a situation where you are sharing a profiler report (VSP file) with team members, sometimes it might be useful to let them know that a warning is not important or has already been considered. In this case you can right-click on the Error List and choose ‘Suppress Message’.

    errorlist_suppress

    The rule warning is crossed out and you can choose to save the VSP file so that the next time it is opened, the suppression is shown:

    errorlist_suppressed

     

    That’s it for now. I plan on covering a little more about rules in a future post, including more details about the rules themselves, how you can tweak thresholds and even write your own rules.

  • Basic Profiler Scenarios

    This post was going to cover some basic scenarios discussing the differences between sampling and instrumentation and when you would choose to switch methods, but then I found there is already something like that in MSDN. If you haven't already, go and take a look. See if you can improve the performance of the PeopleTrax app.

    Instead I'll discuss sampling and instrumentation from a user's perspective. There are already many definitions of sampling vs instrumentation so I won't repeat them.

    For some background reading on the sampling aspect, take a look at David Gray's post. There are a few things that he hasn't covered in that post. The main question I had was should I use sampling or instrumentation?

    A generic answer to that would be:

    • If you know your performance problem is CPU-related (i.e. you see the CPU is running at or near 100% in Task Manager) then you should probably start with sampling.
    • If you suspect your problem may be related to resource contention (e.g. locks, network, disk etc), instrumentation would be a better starting point.

    Sometimes you may not be sure what type of performance issue you are facing or you may be trying to resolve several types of issues. Read on for more details.

    Sampling

    Why use sampling instead of instrumentation?

    Sampling is lighter weight than instrumentation (see below for reasons why instrumentation is more resource intensive) and you don't need to change your executable/binaries to use sampling.

    What events do you sample with?

    By default the profiler samples on clock cycles. These should be familiar to most users because they relate to the commonly quoted frequency of the machine. For example, 1 GHz is 1 billion clock cycles per second. With the default profiler setting of one sample every 10,000,000 clock cycles, that means 100 samples every second on a 1 GHz machine.
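    The arithmetic behind that figure is easy to check. The sketch below uses the default interval of 10,000,000 clock cycles mentioned in this post; the function names are hypothetical and it assumes a fixed clock frequency, which real machines do not always have.

```cpp
#include <cassert>
#include <cstdint>

// Samples taken per second for a given clock frequency (in Hz) and
// sampling interval (in clock cycles).
double SamplesPerSecond(std::uint64_t clockHz, std::uint64_t cyclesPerSample)
{
   assert(cyclesPerSample > 0);
   return static_cast<double>(clockHz) / static_cast<double>(cyclesPerSample);
}

// Time between two consecutive samples, in milliseconds.
double MillisecondsBetweenSamples(std::uint64_t clockHz, std::uint64_t cyclesPerSample)
{
   assert(clockHz > 0);
   return static_cast<double>(cyclesPerSample) * 1000.0 / static_cast<double>(clockHz);
}
```

    For a 1 GHz machine, SamplesPerSecond(1000000000, 10000000) gives 100 and MillisecondsBetweenSamples(1000000000, 10000000) gives 10, so a function that runs for much less than 10 ms in total may never appear in a sample.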

    Alternatively, you could choose to sample using Page Faults, which might occur frequently if you are allocating/deallocating memory a lot. You could also choose to profile using system calls or some lower level counter.

    How many samples is enough to accurately represent my program profile?

    This is not a simple question to answer. By default we only sample every 10,000,000 clock cycles, which might seem like a long time between samples. In that time, your problematic code might block waiting on a lock or some other construct and the thread it is running in might be pre-empted, allowing another thread to run. When the next sample is taken the other thread could still be running, which means the problematic code is not included in the sample.

    The risk of missing the key data is something that is inherent in any sample-based data collection. In statistics the approach is to minimize the risk of missing key information by making the number of samples large enough relative to the general population. For example, if you have a demographic that includes 10000 people, taking only 1 sample is unlikely to be representative. Taking a sample of 1000 people might be considered representative. There are more links about this on Wikipedia.

    Won't this slow down my app?

    No, not really. When a sample is taken the current thread is suspended (other application threads continue to run) so that the current call stack can be collected. When the stack walk is finished, execution returns to the application thread. Sampling should have a limited effect on most applications.

    Sounds good, why use instrumentation?

    See below.

    Instrumentation

    Why use instrumentation?

    As discussed above, sampling doesn't always give you the whole picture. If you really want to know what is going on with a program the most complete way is to keep track of every single call to every function.

    How does instrumentation work (briefly)?

    Unlike sampling, with instrumentation the profiler changes the binary by inserting special pieces of code called probes at the start and end of each function. This process is called 'instrumenting the binary' and it works by taking a binary (dll or exe) along with its PDB and making a new 'instrumented binary'. By comparing a counter's value at the end of the function with its value at the start, it is easy to determine how long a function took to execute.
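    The real profiler rewrites the compiled binary, but the idea of enter and exit probes can be sketched at the source level. Everything below is a hypothetical illustration, not the profiler's probe code: ScopedProbe, ProfileData and Foo are made-up names, and std::chrono::steady_clock stands in for whatever counter the profiler actually reads.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>

// Hypothetical store of accumulated time and call counts per function.
struct ProfileData
{
   std::map<std::string, std::chrono::steady_clock::duration> elapsed;
   std::map<std::string, int> calls;
};

// Hypothetical probe pair: the constructor acts as the enter probe and
// the destructor acts as the exit probe.
class ScopedProbe
{
public:
   ScopedProbe(ProfileData& data, const std::string& function)
      : data_(data), function_(function), start_(std::chrono::steady_clock::now())
   {
   }

   ~ScopedProbe()
   {
      const auto end = std::chrono::steady_clock::now();
      assert(end >= start_);  // steady_clock never goes backwards
      data_.elapsed[function_] += end - start_;  // end counter minus start counter
      ++data_.calls[function_];
   }

private:
   ProfileData& data_;
   std::string function_;
   std::chrono::steady_clock::time_point start_;
};

// A function as it might look after instrumenting it by hand.
double Foo(ProfileData& data)
{
   ScopedProbe probe(data, "Foo");
   double d = 0;
   for (int i = 0; i < 1000; ++i)
   {
      d += i;
   }
   return d;
}
```

    Every call to Foo now pays for two clock reads and two map updates, which hints at why instrumentation generates more data and more overhead than sampling.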

    What if I call other people's code?

    Usually you don't have access to the PDB files for other people's code which means you can't instrument it. Fortunately as part of the instrumentation process the profiler inserts special probes around each call to an external function so that you can track these calls (although not any functions that they might call).

    Why not just use Instrumentation all the time?

    Computers execute a lot of instructions in 10,000,000 clock cycles, so using instrumentation can generate a LOT of data compared with sampling. The process of calling the probe functions in an application thread can also degrade performance more than sampling would.

  • Developer Dogfooding at Microsoft

    I hadn't heard the term dogfooding used much before I started here, but it has already been explained so take a look here. The basic idea is that if you're not happy using your product (i.e. eating your own dogfood) then why should you expect your customers to be? Working at Microsoft gives you incredible scope to dogfood a wide variety of products.

    As a Microsoft employee, I should be using Internet Explorer, Vista, Office, etc., and I am. This doesn't necessarily mean I shouldn't run alternative products as well, or use them when a Microsoft product doesn't provide the functionality I need.

    As a Microsoft developer, I should be using Team Foundation Server for bug tracking and source control. I should be developing Visual Studio using Visual Studio. I should be profiling my code using VSTS profiling tools. Fortunately I am, although not exclusively, and this is probably not the case in some other parts of the company.

    The main reason I think this is a good idea is that we get to feel the same pain that customers do. We have extra incentive to fix any problems instead of ignoring them. We often catch problems early on before customers even see them.

    I'll admit it, the process can be painful. The pain typically increases as you get closer to the bleeding edge of technology. For example, my Visual Studio dogfooding experience involves running the latest build of VSTS while developing. There are issues which delay my development, but facing these issues every day helps me drive improvements to the product. Imagine if your source control system went down - you'd want it fixed pretty quickly and that's just what we want from our TFS dogfood server.

    Here are a few things that I think need to happen for successful dogfooding:

    • The process must not be voluntary. As an individual dev I must use a pre-release version of TFS. As a Microsoft employee my computer is automatically updated to use the latest updates before they are pushed out to customers. There isn't a choice.
    • There must be a feedback mechanism. If things are broken it must be easy to report this and critical breaks must be fixed quickly.
    • Things must actually get better. Limit the audience for really unstable dogfooding. For example, we don't make devs outside the VS team build their own VS from last night's source. They get a 'Last Known Good' build of a release that has had extra testing carried out on it.

    If you're an application developer, are you using your own alpha/beta software before it is released to the public?
