GPU profiling in Visual Studio 2013 Update 2


The graphics debugging tool formerly known as PIX has been integrated into Visual Studio for a while now, and gets better in every release.  But unlike Xbox PIX, the Windows incarnation of this technology has until now been only for debugging and not profiling.  It provided lots of information about what happened, but none at all about how long things took.

For Windows Phone 8.1, my team (hi Adrian!) added the ability to measure and analyze GPU performance.  I’m particularly proud of the fact that, thanks to our efforts to make the Windows and Phone graphics stacks as similar as possible, we were able to build this new feature focusing mostly on Phone, yet the resulting code works exactly the same on full Windows.  Visual Studio is even able to reuse a single version of our GPU performance analysis DLL across both Windows 8.1 and Phone 8.1.

Rong’s talk at the Build conference shows this in action, and you can download Visual Studio 2013 Update 2 RC to try it out for yourself.

 

Here’s how it works.  I opened the default D3D project template, which gives me an oh-so-exciting spinning cube plus a framerate counter in the bottom right corner:

 

image

 

To use the graphics diagnostics feature, open the Debug menu, click Graphics, and then Start Diagnostics:

 

image

 

This will run the app with D3D tracing enabled.  Press the Print Screen (PrtSc) key one or more times to capture the frames you want to analyze.  When you quit the app, Visual Studio will open its graphics debugger.  This will look familiar if you have used PIX before, but the UI is considerably improved in this release, plus it now supports Phone as well as Windows:

 

image

 

So far so good, but where is this new profiling feature?   Select the Frame Analysis tab, and click where it says Click here:

 

image

 

Our new analysis engine will whir and click for a while (the more complicated your rendering, the longer this will take).  When everything has been measured it shows a report describing the GPU performance of every draw call in the frame:

 

image

 

This simple app only contains two draw calls.  Event #117 (DrawIndexed) is the cube, while #137 (DrawIndexedInstanced) is the framerate counter.  There would obviously be a lot more data if you analyzed something more complicated, in which case the ID3DUserDefinedAnnotation API can be used to organize and label different sections of your rendering.
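As a rough sketch of how that labeling might look, the helper below wraps the BeginEvent/EndEvent pair of ID3DUserDefinedAnnotation in a scope guard, so each group is closed even on an early return. ScopedGpuEvent, FakeAnnotation and RenderFrame are hypothetical names invented for this illustration, and FakeAnnotation only stands in for the real interface so the snippet compiles without the Windows SDK. In a real app the annotation object comes from calling QueryInterface on your ID3D11DeviceContext (the interface is declared in d3d11_1.h).

```cpp
#include <cassert>
#include <string>
#include <vector>

// RAII helper: opens a named event group on construction and closes it on
// destruction. Annotation is any type exposing BeginEvent(const wchar_t*)
// and EndEvent(), which is the shape of ID3DUserDefinedAnnotation.
template <typename Annotation>
class ScopedGpuEvent {
public:
    ScopedGpuEvent(Annotation* annotation, const wchar_t* name)
        : annotation_(annotation) {
        assert(name != nullptr);
        if (annotation_) annotation_->BeginEvent(name);
    }
    ~ScopedGpuEvent() {
        if (annotation_) annotation_->EndEvent();
    }
    ScopedGpuEvent(const ScopedGpuEvent&) = delete;
    ScopedGpuEvent& operator=(const ScopedGpuEvent&) = delete;

private:
    Annotation* annotation_;  // not owned; may be null if annotation is unavailable
};

// Hypothetical stand-in for the real interface. It just records the calls.
struct FakeAnnotation {
    std::vector<std::wstring> log;
    int BeginEvent(const wchar_t* name) {
        log.push_back(std::wstring(L"begin:") + name);
        return 0;
    }
    int EndEvent() {
        log.push_back(L"end");
        return 0;
    }
};

// Hypothetical frame function showing two labeled sections.
template <typename Annotation>
void RenderFrame(Annotation* annotation) {
    {
        ScopedGpuEvent<Annotation> scene(annotation, L"Scene");
        // The DrawIndexed call for the cube would go here.
    }
    {
        ScopedGpuEvent<Annotation> hud(annotation, L"HUD");
        // The draw calls for the framerate counter would go here.
    }
}
```

With the real interface you would pass your ID3DUserDefinedAnnotation pointer instead of the fake, and the two groups should then bracket their draw calls in the captured event list.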

The blue bars near the top (labeled Time) show how long each draw call took for the GPU to execute.  Clearly our cube is much more expensive than the framerate text, although both are ridiculously quick, as this template isn’t exactly stressing my GPU :-)   The column titled Baseline shows the numeric duration of each draw, and the other columns show a series of experiments where we changed various things about the rendering and measured how much difference each one made to GPU time.  For instance, this data tells us that:

  1. Shrinking the output viewport to 1x1 reduced GPU time to just 2% of the original.  This means we are heavily fill rate limited, so a possible optimization would be to reduce the backbuffer resolution.
  2. Turning on 2x or 4x MSAA slowed things down, but only by ~10%, so it is worth considering whether we can afford that slight perf hit in exchange for the quality improvement.
  3. Reducing the backbuffer from 32 to 16 bit format gave only a small improvement.
  4. Automatically adding mipmaps to all the textures, or shrinking all the textures to half size, did not significantly affect performance, so we know this app is not bottlenecked by texture fetch bandwidth.
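To make the arithmetic behind those readings concrete, here is a minimal sketch that expresses an experiment's GPU time as a percentage of the baseline and sorts it into a rough bucket. PercentOfBaseline, Classify, Verdict and the 5% margin are all assumptions made up for this illustration. The real report highlights results by statistical significance rather than by a fixed threshold, so treat this as a simplification and not as what the tool does internally.

```cpp
#include <cassert>

// GPU time of an experiment expressed as a percentage of the baseline time.
// A result of 2.0 means the draw took 2% as long as it did originally.
inline double PercentOfBaseline(double baselineMs, double experimentMs) {
    assert(baselineMs > 0.0);
    return 100.0 * experimentMs / baselineMs;
}

enum class Verdict { Improvement, Regression, NoClearChange };

// Buckets a result using a fixed noise margin. The margin is a hypothetical
// stand-in for the statistical significance test used by the real report.
inline Verdict Classify(double percentOfBaseline, double marginPercent = 5.0) {
    if (percentOfBaseline < 100.0 - marginPercent) return Verdict::Improvement;
    if (percentOfBaseline > 100.0 + marginPercent) return Verdict::Regression;
    return Verdict::NoClearChange;
}
```

Under these assumptions, a 1x1 viewport result of 2% lands in the Improvement bucket and points toward fill rate, the MSAA result at roughly 110% is a Regression, and a texture experiment that stays near 100% is NoClearChange, which suggests texture bandwidth is not the bottleneck.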

There are a couple of different forms of color highlighting going on in this report:

  1. The background of the first draw call is light red to show it was one of the more expensive draws in the frame, and therefore the part worth concentrating on.
  2. The most statistically significant differences produced by the various rendering experiments are highlighted in green (for improvements) or red (for changes that hurt performance).  Numbers that are not highlighted indicate that, although we did measure a change of performance, this may just be random measurement noise rather than a truly significant change.

Move the mouse over any of these numbers to view a hover tip showing more data about that particular measurement.

 

“Sounds great!  So what types of device can I use this stuff on?”

  1. The debugging part of this tool works on all Windows 8.1 and Phone 8.1 devices.
  2. Performance analysis requires the graphics driver to support timestamp queries, which were not part of Windows Phone 8.  This will work on Windows, and on newer 8.1 phones once those are available, but it will not work on existing phones  (even when they are upgraded to the 8.1 OS, their older drivers will be missing the necessary query ability).
  3. New 8.1 phones will also report GPU counter values directly from the driver, which gives much richer information about what is going on inside the GPU.
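For readers wondering why timestamp queries matter, the sketch below shows the general idea of turning two GPU timestamps into a duration. It is not the code Visual Studio runs. In D3D11 a D3D11_QUERY_TIMESTAMP query yields a tick count, and a surrounding D3D11_QUERY_TIMESTAMP_DISJOINT query yields the tick frequency plus a flag saying whether the ticks can be trusted. DisjointData and TryGpuMilliseconds are hypothetical names; DisjointData just mirrors the two fields of D3D11_QUERY_DATA_TIMESTAMP_DISJOINT so the snippet compiles without the Windows SDK.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors the fields of D3D11_QUERY_DATA_TIMESTAMP_DISJOINT.
struct DisjointData {
    std::uint64_t frequency;  // GPU timestamp ticks per second
    bool disjoint;            // true if the timestamps in this range are unreliable
};

// Converts a pair of GPU timestamps into milliseconds. Returns false (and
// leaves *outMs untouched) when the measurement cannot be trusted.
inline bool TryGpuMilliseconds(std::uint64_t beginTicks,
                               std::uint64_t endTicks,
                               const DisjointData& data,
                               double* outMs) {
    assert(outMs != nullptr);
    if (data.disjoint || data.frequency == 0 || endTicks < beginTicks) {
        return false;
    }
    const double elapsedTicks = static_cast<double>(endTicks - beginTicks);
    *outMs = 1000.0 * elapsedTicks / static_cast<double>(data.frequency);
    return true;
}
```

A driver that cannot answer these queries leaves a tool with no tick values to subtract, which is consistent with the debugging features working on older phones while the timing features do not.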
  • Great work! Is this going to VS Express btw?

  • It would be nice for DirectX 11 desktop applications as well.  They require just as much GPU debugging and profiling love.

  • Visual Studio supports desktop apps just the same as Store ones.

  • Yes, this feature is part of VS Express as well.

  • In VS 2013 Express for Desktop with Update 2 RC there is no Graphics sub-menu in the Debug menu.

  • Wow, this looks fantastic, I will be using it to profile my game on all the platforms it's on now!

    2 Questions.

    Will it profile the DirectX calls in a desktop XNA 4.0 app? Most of my games are XNA 4.0 based for the foreseeable future.

    Does it support (or will support) the Xbox One?

  • This tooling supports desktop as well as store apps, but it's only for D3D11 while XNA uses D3D9.

    Xbox One has its own somewhat different version of PIX, which does many similar things but in a more hardware specific way  (many things get easier, and some richer types of data can be obtained, when you don't have to worry about supporting more than one type of GPU with different drivers!)

  • This appears to work only in pure D3D apps. XAML and D3D apps can get the graphics debugger to show, but capturing the current frame(s) never works. It is stuck on "waiting for DirectX to finish frame".

  • I have tried this feature and it's a great performance analysis tool. But I am wondering why it shows timing only for draw calls and not for dispatch calls from DirectCompute. This should be something you could add in the coming updates. I have also tried the recently released update RC 3, and it does not have this feature either.

  • Thanks for the feedback Khantil!

    We didn't support compute shader profiling mostly due to lack of time, but also prioritization as this feature was initially focused mostly on Windows Phone, which doesn't support compute shaders at all!  Would definitely be a good thing to add in the future, though, and thanks for sharing your input on this.
