PerfTips: Performance Information at-a-glance while Debugging with Visual Studio

PerfTips: Performance Information at-a-glance while Debugging with Visual Studio

Rate This
  • Comments 15

You are probably all familiar with the debugger DataTips that let you see the value of a variable when you hover over it in the editor. This post introduces the new PerfTips feature in Visual Studio for C#/VB, C++, and JavaScript developers on all Windows platforms. We have just released an early preview of this new feature that you can try out today by downloading the latest CTP of Visual Studio “14”. If you do that, be sure to give us your feedback by signing up to receive a short PerfTips survey.

Why PerfTips?

We have been releasing new and improved performance tools in the Performance and Diagnostics hub in Visual Studio 2013 that help you diagnose performance issues in your app. However, for many of you, the majority of your time is dedicated to getting the app to function correctly (as opposed to focusing on the performance of your app). You would like to understand the performance of the code you write, but you might not have the time to learn and run the performance tools until performance becomes a big problem (which is sometimes too late). Furthermore, when you do want to investigate performance problems, you often just need a simple stopwatch measurement of how long it takes to execute isolated sections of code rather than the detailed top-down analysis provided by the hub tools.

We know that instead of running profiling tools, many developers do one of the following things:

  1. Insert code into their app (such as System.Diagnostics.Stopwatch) to measure how long it takes to run between various points in code, iteratively adding more stopwatches as needed to narrow down the hot path
  2. Step through code to see if any particular step “feels slow”
  3. Hit the Break All (“pause”) button at random points to get a feel of how far execution has progressed through their code, e.g. a loop. In some circles this is referred to as “poor man’s sampling”
  4. Over-optimize code without measuring performance at all, sometimes by applying a set of performance best practices (e.g. don’t use LINQ) across an entire code base

These practices are typically not very accurate, not a good use of time, or both. That’s why we introduced PerfTips: to help you to understand the performance of your app during normal debugging. Anyone that uses the debugger can use PerfTips to do a quick measurement on any section of code and find unexpected performance issues simply by setting breakpoints and/or by stepping through code with the debugger. With PerfTips you don’t need to run a separate set of tools or insert stopwatches into your code to take a simple performance measurement.

What are PerfTips?

As you step over lines of code or run from breakpoint to breakpoint, the debugger will display a PerfTip (tooltips with performance information) in the editor at the end of the code line indicating how long the program was running during the previous step or since the last breakpoint. All you need to do is look at it!

2_AfterStep1

More specifically, any time that you are stopped on a line of code in the debugger, the debugger will show a PerfTip on that line of code if there are any functions on the call stack that were also on the call stack when the program was last started or resumed. For example, if you are stopped in Button1_Click(), press F5 and hit a breakpoint in a child function called by Button1_Click(), you will see a PerfTip.  However if you are stopped in Button1_Click(), press F5, then click Button2 and hit a breakpoint in Button2_Click() (which is not on the same call stack as Button1_Click) you will not see a PerfTip.

You can even hover your mouse over the PerfTip to see the CPU time of the app while the program was running. Notice how there are two numbers in the tooltip, the elapsed time that you see on the PerfTip itself, plus the CPU time.

NeedsMouse

In most cases the CPU time will be less than the elapsed time because the execution measured will include non-CPU intensive activity such as I/O. The CPU time can also be greater than the elapsed time if multiple threads and cores are being used concurrently by your app. This is because CPU time is counted as the total amount of time each individual core on the CPU was executing code.

Notice in the screenshots above, that the PerfTip is shown after the line of code that will be executed next. We decided that, while potentially confusing at first, this was the best place for PerfTips so that the location would be the same for both the single line and the multiple line cases (as shown in the next section).

You can customize PerfTips by going to Tools -> Options -> Debugging -> Performance Tools, or you can go directly to that options page by clicking on the PerfTip and selecting “Performance Tips Options…”.

4_Click

5_Options

From here you can set options so that PerfTips only show up when the values exceed a certain threshold, and you can also enable the CPU time PerfTip to see the CPU time at-a-glance without having to hover on the PerfTip – see the following screenshot of what that looks like.

6_CPUTimeStep

Keep on reading to see how to use PerfTips in action!

PerfTips in Action

To show you an example of how it works, we will take a look at improving the startup performance of a C#/XAML Windows Store app. This app loads and displays pictures from the local pictures library and from the cloud, and we want to see how much time it takes to load images while the app is starting up.

Instead of inserting stopwatch code into our app, we can measure how long loading images takes by setting breakpoints where we would have started and stopped the stopwatch. In this case, we have set breakpoints at the start and the end of the LoadImages method.

0_BreakPointsInEditor

To take our first measurement with PerfTips we start debugging and wait for the first breakpoint to be hit (on line 67). Then we press F5 or click Continue to run to the second breakpoint and see the PerfTips for this section of code:

1_FirstBreakPoint

Here we are at the second breakpoint (on line 80), and the PerfTip is displayed. In this case we have configured PerfTips so that both elapsed time and CPU time are shown:

2a_SecondBreakpoint

From the information presented, we see that the app was running for 2,780ms between these two breakpoints and during that time 985ms of CPU was used on all CPU cores. It makes sense that the CPU time is much less than the elapsed time because the app is not going to use much CPU while it is waiting on network or disk I/O.

Side TIP: Instead of using breakpoints, we could have also taken a similar measurement by using Run To Cursor (Ctrl+F10, or right-click -> Run To Cursor) first with the cursor on the first line of the function and then again with the cursor on the last line of the function.

Now that we know how long this particular code path takes to run, we can run the same scenario again and after the first breakpoint is hit step over each line to see which lines are taking the most time. For those who are stepping over code to measure performance, you can now do the same thing and see more accurate measurements for each step. As we step over the first few lines we see that they take less than 20ms each, but when we step over GetImagesFromCloud (line 72) it takes 1,391ms to run and uses 266ms of CPU time. Here we are before stepping:

3_ReadyToStep

… and then after stepping over GetImagesFromCloud (line 72):

4a_FirstStep

Then we step over LoadImagesFromDisk (line 73) and see that it takes 1,361ms to run and uses 563ms of CPU time:

5a_SecondStep

Finally we step over the LINQ query (lines 75 and 76) and see that it takes 259ms and uses 156ms of CPU:

6a_ThirdStep

As we step over the remaining lines they each take less than 10ms to complete.

By stepping through the code we were able to determine that getting images from the cloud and loading images from the disk (statements on lines 72 and 73) both take about the same time and use relatively little CPU. We can likely improve performance by doing both concurrently, especially since these two operations will not be competing for CPU resources. Instead of waiting on both sequentially, we change the code to kick off both tasks at once and then await on both to complete. Running between the same two breakpoints as before, we measure how long it takes to load images with this change.

7_AfterOptimizing

After the change this section of code completes in 2,079ms which is about 700ms faster than the 2,780ms it took initially, a 25% improvement! This is an example of an optimization you could easily make while you are first writing and debugging this area of code.

If you are optimizing code without measuring at all, you can now take these kinds of quick measurements with PerfTips to make informed choices about where you spend your time. For example, before you may have considered improving the performance of the LINQ query by generating the collection of images manually, but now we know from the PerfTips that this line only takes around 250ms and we can expect to make image loading at most 10% faster by applying such an optimization.

Now that we’ve given you a tour of PerfTips, we’ll go over some best practices that you should keep in mind when using this feature.

Best Practice: Take Multiple Measurements

Performance naturally varies from run-to-run, and code paths typically execute slower the first time they are executed in a given run of the app due to one-time initialization work such as loading .dlls, JIT compiling methods, and initializing caches. Taking multiple measurements will give you an idea of the range and median of the metric being shown and will allow you to compare the first time vs. steady state performance of an area of code.

Best Practice: Verify in Release

If you want to make improvements on pieces of code that run for less than 50 milliseconds, or you are about to invest a lot of time in rewriting code to make it run faster, we recommend that you switch to the Release build of your app and use the tools that are available in the Performance and Diagnostics hub to measure the performance of your app without the debugger. Using the hub tools on a Release build without the debugger will more accurately measure the performance that the users of your app see, and provides a much more detailed breakdown of where time is being spent. If you are not familiar with the hub, you can check back here (or on the on the Diagnostics blog feed) in a few weeks for a follow-up blog post which will show you how to use the hub tools to measure the exact same data that you see in our earlier PerfTips example.

You can also change how you use the debugger in order to make PerfTips more accurate. When you are running with the debugger, there is some overhead included in the PerfTips when the debugger pauses and resumes the target process. While this overhead is normally small (a few milliseconds), the debugger may sometimes have to pause and resume the target process several times during a single step, making the overhead much larger. These pauses can also change the performance characteristics of code when it is executing. For example, cache performance on the CPU will be better when code is executing in a tight loop without breaking. To mitigate these effects you can measure larger sections of code by running between breakpoints as shown in the first example in this post.

In sections of code that are highly sensitive to compiler optimizations, the CPU time shown in debugging or when the debugger is attached can be significantly higher than what it is in release mode. This is because code optimizations such as inlining are not applied and extra validation such as array bounds checking are added in Debug builds. This allows you to have a better debugging experience, but with extra run-time overhead. You can mitigate this by switching to a Release build and debugging the optimized version of the code. If you are debugging managed code you will also need to uncheck “Suppress JIT optimization on module load” and “Enable Just My Code” in Tools -> Options -> Debugging. However, be warned that you will likely encounter breakpoints that fail to set and the yellow arrow moving unexpectedly when stepping. So, if you would like to verify the release performance of your app, as stated earlier, using the tools in the Performance and Diagnostics hub is the recommended approach.

Known Issues in Current VS “14” CTP Releases

We wanted to make this feature available for you to try as soon as possible. In order to do that we have released PerfTips in the early preview builds of Visual Studio “14” without waiting to fix all of the known (and unknown) issues. The following are the issues we know about:

  • First-chance exceptions can add 10s-100s of milliseconds to PerfTips per exception. If you suspect this is the case, glance at the Output window to see if a first-chance exception occurred.
  • Occasionally a PerfTip will not be shown if a large number of modules were loaded since the app was last resumed. Again, the Output window indicates when modules are loaded.
  • When using the PerfTips on the Phone or on VMs, simple lines of code that should complete in well under 1ms show as taking 10s of milliseconds.

If you find any others, please take a moment of your time and let us know so we can fix them before the final RTM release! Read onto the next section to find out how to share your feedback with us.

We Want Your Feedback!

This is a new way of approaching performance analysis that is easy to use and is much closer to how you understand your code when you write it and debug it. We think you’re going to like PerfTips but we would like to make it as useful as we can to you. If you have feedback for us, you can immediately send us general comments in Send-a-smile, feature requests on UserVoice, and submit issues you run into on the Forums.

In addition to that, you can sign up now to receive a PerfTips survey and send us your in-depth evaluation of the feature.

Leave a Comment
  • Please add 6 and 4 and type the answer here:
  • Post
  • I think this is a helpful mechanism and better than me having to use printf() style data collection, but I fail to see the great value in this feature. As a performance engineer evaluating performance during debugging seems asinine and by your own admission you mention switching over to Release mode to measure performance. So am I now going to debug my code in Release just to get performance tips or do I leave it in Debug and accept the performance tips are not really that accurate? This seems like a feature that was not fully thought through. I will reserve final after I play with it, but I am dubious of what I can actually use these values for.

    Thanks!

  • Does this work for native c++ code?

  • I just installed VS 14 CTP3, but can't enable this feature. Is there a setting somewhere? Found the perf tab in debugging, but can't enable it there, only configure. I get absolutely no perf info no matter how i navigate the code while debugging.

  • I tried it under a managed C# project (targeting .net fw 4.5) for the record

  • Fantastic - I've been asking MS for such a facility for years and had all but given up hope that it'd ever be done.

    Thank you, I'm really looking forward to trying it.

    In response to Thomas's point - this is not meant to be an accurate performance tool, but I'm 99.9% sure from my experience that it will highlight design performance issues early on when developers are prototyping/developing code.

    One query on the implementation...

    "... the debugger will show a PerfTip on that line of code if there are any functions on the call stack that were also on the call stack when the program was last started or resumed"

    Although the example given (a button handler) isn't suitable for performance measurement, I can envisage others where the breakpoint won't be in the same call stack.

    Was this behaviour a conscious design decision, or a consequence of the design?

  • attempted a simple C# console application and WPF application too. couldn't enable perftips in any way.

  • @Mike, yes this works for native C++ code!

    @P, you should not need to enable any settings in order to see PerfTips while you are debugging, there may be an issue related to your particular development environment. Can you reach out to me at dantaylo [at] microsoft.com and we can nail down what the issue might be?

    @DaveL, thanks for the positive feedback! The call stack behavior was a conscious design and it is something we can change if needed. Can you share with me some cases where you would want to do performance measurements and the breakpoints are not on the same stack?

  • Hi!

    I don't have the change to try the bits now, but does this work with F#? Or perhaps a better question would be, how is this feature implemented under the hoods?

    I'd think this is a really useful feature to get some figures early on, as mentioned already, and see if something changes drastically, and just plain out working with some variations of the code.

  • Hi Dan, regarding my question on the call stack behaviour, my immediate thought was in a multi-threading situation, where you might be at the point in one thread where an event is set, and have a breakpoint in the thread that handles the event.

    Fundamentally, limiting the timing points to be in the same call stack seemed to me to be an arbitrary limitation - but of course I don't know what your implementation/constraints are.

  • @Veikko: this does work with F#, thank you for asking! The feature uses the default Windows .NET, C++, and JavaScript debuggers to time the application's run time and so it will work for any managed language in most cases. This is not implemented in our legacy debugging engines however, so if you are using managed compatibility mode or managed C++ PerfTips will not be available.

    @DaveL: the reason for the restriction was that we did not want to show you PerfTips when you hit a breakpoint in code that is completely unrelated to the code you were debugging at the last breakpoint -- we thought this would be confusing. However, for the event handler example you provided the call stack is not a good indicator of code being related at each breakpoint. We will re-evaluate the call stack restriction with this example in mind, thanks!

  • Let me guess — VS Galactic Edition only?

  • @Eric: I hope it is not in the Galactic SKU! As you can see, at least for the CTP 3, it is in the Pro SKU. Having said that, I work on the product team that builds (hopefully useful) features, and it is the marketing team that makes SKU decisions, and they make those later in the release cycle. If you have any feedback on the feature itself please send it my way by leaving additional comments here or using the feedback mechanisms listed at the end of the blog post.

  • I've found time to have a brief look at this facility. A couple of thoughts/suggestions:

    1. Enable both the elapsed and CPU display by default. Currently CPU consumption is not enabled on a clean installation - many people will remain ignorant that you can show it.

    2. Can the time display be more flexible - showing μs, or ms (whichever is most appropriate to the current value)? Currently, anything sub-1ms shows as 0, which you know it can't be. :)

  • (reposting comment from earlier with a correction)

    @Thomas, we intend for the PerfTips to be used in debug mode even though they are less accurate (as discussed in the post, this maintains a good debugging experience). The tradeoff is that they serve as an “early warning system” for performance issues and help you to get an order-of-magnitude sense of performance while you are already debugging code for feature development purposes. Many developers would otherwise not be making any performance measurements of their app at this time and this can lead to performance issues not being caught until way too late and/or premature optimization of code.

    If you see a number that you are happy with while debugging, assuming you are in a representative environment, you can conclude that you do not need to investigate further or optimize this area of code. If you see a number that is higher than you would like, then it is a flag that further investigation may be needed using the more accurate performance tools. As a performance engineer, you may be spending most of your time in that “further investigation is needed” camp and so in that case we would intend for you to primarily use the tools in the Performance and Diagnostics hub. Please do try out PerfTips in the CTP3 and sign up for the survey to give us your evaluation of the feature. We especially want to hear about any way that we can change PerfTips to them to make them more useful to you in the upcoming release!

  • @DaveL: really appreciate you trying out the feature and the feedback! On #1, I agree that CPU time is an important part of understanding performance but we have to balance that against making the information is easy ingest the first time you see the feature (especially for those who are completely unfamiliar with performance measurements). At the moment we are being conservative and only showing elapsed time, but we may decide to change that later.

    On #2, at the moment we don't have a microsecond level of precision (because of debugger overhead), but if we managed to get to microsecond level of precision we would definitely show it. You are correct that we should never show 0ms, though. We'll fix that.

Page 1 of 1 (15 items)