Tuning C++ build parallelism in VS2010

A great way to get fast builds on a multiprocessor computer is to take advantage of as much parallelism in your build as possible. If you have C++ projects, there are two different kinds of parallelism you can configure.

What are the dials I can set?

Project-level parallel build, which is controlled by MSBuild, is set at the solution level in Visual Studio. (Visual Studio actually stores the value per user, per computer, which may not always be what you want: you may want different values for different solutions, and the UI doesn’t allow that.) By default, Visual Studio picks the number of processors on your machine. Experiment with slightly higher and lower numbers to see what gives the best speed for your particular code. Some people like to dial it down a little so that they can do other work while a build goes on.

image

This dial is just the same as in VS2008, although under the covers MSBuild is taking over some of the work from Visual Studio now.

If you’re building C++ or C++/CLI, there’s another place you can get build parallelism. The CL compiler supports the /MP switch, which tells it to build subsets of its inputs concurrently with separate instances of itself. The default number of buckets, again, is the number of CPUs, but you can also specify a number, like /MP5. This was available before too, so I’m just going to remind you where the value is and what it looks like in the MSBuild format project file.

Go to your project’s property pages, and to the C/C++, General page. For now I suggest that you select All Configurations and All Platforms. You can be more selective later if you want.

image 

As usual you can see what’s in the project file by unloading it, right clicking on the node in the Solution Explorer, and choosing Edit:

image

Here’s what it looks like in the project file. Yes, it’s inside a configuration and platform specific block, but it put the same value in all of them.

image

Notice that it’s in an “ItemDefinitionGroup”. That MSBuild tag simply indicates it defines a  “template” for items of a particular type. In this case, all items of type “ClCompile” will automatically have metadata MultiProcessorCompilation with value true unless they explicitly choose a different value.
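As a rough sketch of the shape being described, a block like the one below would appear once per configuration and platform. The `Debug|Win32` pair in the condition is only an assumed example; your project will have whatever configurations it defines.

```xml
<!-- Sketch only: one configuration/platform block; Debug|Win32 is an assumed example -->
<ItemDefinitionGroup Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
  <ClCompile>
    <!-- Default metadata for every ClCompile item; this is what turns on /MP -->
    <MultiProcessorCompilation>true</MultiProcessorCompilation>
  </ClCompile>
</ItemDefinitionGroup>
```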

By the way, MSBuild Items, in case you’re wondering, are just files, usually. Their subelements, if any, are the metadata. Here’s what some look like. Notice they’re in an “ItemGroup”:

image
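For readers without the screenshot, items in a C++ project file look roughly like this. The file names are made up for illustration; `PrecompiledHeader` is shown only as a familiar example of a subelement acting as metadata on a single item.

```xml
<ItemGroup>
  <!-- Each ClCompile item is a source file (hypothetical names) -->
  <ClCompile Include="main.cpp" />
  <ClCompile Include="stdafx.cpp">
    <!-- A subelement is metadata that applies to this one item only -->
    <PrecompiledHeader>Create</PrecompiledHeader>
  </ClCompile>
</ItemGroup>
```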

Because this is metadata, at an extreme I could set it on a per-file basis. In that case, MSBuild would bucket together all the inputs that have a common value. You would need to disable /MP for particular files that use #import, for example, because that's not supported with /MP. (Other features not supported with /MP include /Gm, which enables minimal rebuild, and a few other switches listed in the /MP documentation.)

Note it’s under the “ItemGroup” because these are actual items:

image
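A per-file override might look something like the following sketch, assuming a hypothetical source file `UsesImport.cpp` that contains a #import directive. The item's own metadata wins over the default from the ItemDefinitionGroup.

```xml
<ItemGroup>
  <!-- Hypothetical file that uses #import, so it opts out of /MP -->
  <ClCompile Include="UsesImport.cpp">
    <MultiProcessorCompilation>false</MultiProcessorCompilation>
  </ClCompile>
  <!-- Other files keep the default from the ItemDefinitionGroup -->
  <ClCompile Include="Widget.cpp" />
</ItemGroup>
```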

Back to multiprocessor CL. If you want to tell CL explicitly how many parallel compiles to do, Visual Studio lets you do this. As for /MP, it's exposed as a global setting:

image

Under the covers, VS passes this on by setting a property (a global property – it's not persisted) named CL_MPCount. That means it won't have any effect when building outside of VS.

If you want to choose a value at a finer-grained level, you can’t use the UI, as it’s not exposed in the property pages or the command line preview. You have to open the project file in the editor and type it. It’s a different piece of metadata on the ClCompile items, named “ProcessorNumber”. It can be any number from 1 upward, and it supplies the numeric value appended to /MP. If <MultiProcessorCompilation> is not also set to true, it is ignored.

image

The squiggle here is a minor bug – ignore it.
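Typed by hand, the metadata described above could look like this sketch. The value 4 is an arbitrary choice for illustration, and is expected to produce /MP4 on the CL command line.

```xml
<ItemDefinitionGroup>
  <ClCompile>
    <!-- ProcessorNumber is ignored unless this is true -->
    <MultiProcessorCompilation>true</MultiProcessorCompilation>
    <!-- Arbitrary example value: ask CL for up to 4 concurrent compiles -->
    <ProcessorNumber>4</ProcessorNumber>
  </ClCompile>
</ItemDefinitionGroup>
```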

What about building on the command line?

The /MP settings come from the project files, so they work exactly the same on the command line. That’s part of the whole point of MSBuild, right, the same build on the command line as in Visual Studio? But the global parallelism setting that you set in Tools, Options does not affect the command line. You must pass it yourself to the msbuild.exe command with the /m switch. Again, the value is optional, and if you don’t supply one it uses the number of CPUs. However, unlike Visual Studio, out of the box, without /m supplied, it uses 1 CPU. That might change in future.

image

To choose the number used for /MP, you can set an environment variable, or pass a property (for example, /p:CL_MPCount=4), named CL_MPCount, just like Visual Studio does.

Setting /MP on every project is tiresome, what are my options?

Probably you’ll want to use /MP on more than one of your projects, and you don’t want to edit each individually. The Visual Studio solution to this kind of problem is property sheets. They don’t have any special connection to multiprocessor build, but it’s an opportunity for me to give a quick refresher using this as an example. First open the “Property Manager” from the View menu. Its exact location will vary depending on the settings you’re using; here’s where it is if you have C++ settings:

image

Right click on a project and choose “Add New Property Sheet”:

image

I gave mine the name “MultiprocCpp.props”. You’ll see it gets added to all configurations of this project. Right click on it, and you’ll see the same property pages that the project has, but this time you’re editing the property sheet. Again, set “Multi-processor Compilation” to “Yes”. Close the property pages, select the property sheet in the Property Manager, and hit Save.

Now I can open up that new MultiprocCpp.props file in the editor, and I see this:

image

(Again, ignore the squiggle.)
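In case the screenshot is unavailable, a minimal property sheet carrying this one setting would look approximately like the sketch below. The file Visual Studio generates may contain additional empty elements; this shows only the part that matters here.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Sketch of MultiprocCpp.props: a small MSBuild file holding one default -->
<Project ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemDefinitionGroup>
    <ClCompile>
      <!-- Every project importing this sheet compiles with /MP -->
      <MultiProcessorCompilation>true</MultiProcessorCompilation>
    </ClCompile>
  </ItemDefinitionGroup>
</Project>
```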

Looking in the project file, you can see the property sheet pulled in to each configuration, using an “Import” tag. Think of that just like a #include in C++:

image
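The import in the project file has roughly this form, repeated for each configuration. The `Debug|Win32` condition is an assumed example, and a real project usually lists other imports in the same group.

```xml
<!-- Sketch: one ImportGroup per configuration/platform; Debug|Win32 is an assumed example -->
<ImportGroup Label="PropertySheets" Condition="'$(Configuration)|$(Platform)'=='Debug|Win32'">
  <!-- Path is relative to the project file -->
  <Import Project="MultiprocCpp.props" />
</ImportGroup>
```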

So now we have the definition we put in the project file before, but in a reusable form. Given that, I can put it into all the projects I want in one shot, by multi-selecting in the Property Manager and choosing Add Existing Property Sheet:

image

Now all your projects compile with /MP !

In some circumstances, you might want to go beyond what you can easily do in the Property Manager. For example you might want to bulk-remove a property sheet, or put a property sheet in each project once outside of all the configurations. Fortunately MSBuild 4.0 has a powerful and complete object model over its files that you can use to do this kind of work in a few lines of code. More on that in a future blog post, but for now, if you want to take a look, point the Object Browser at Microsoft.Build.dll.

Before I leave property sheets, it’s worth mentioning that you can do this kind of common-importing in your own ways, if you don’t mind losing some of the UI support. For example, in the build of VS itself, we pull in a common set of properties at the top of every project, like this example from the project that builds msenv.dll (which contains much of the VS shell)

image
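The general pattern, not the actual VS build files, can be sketched like this. The path `..\build\Common.props` is purely hypothetical; substitute wherever your tree keeps its shared settings.

```xml
<Project DefaultTargets="Build" ToolsVersion="4.0" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <!-- Hypothetical shared file, imported before anything else in the project -->
  <Import Project="..\build\Common.props" />
  <!-- The rest of the project follows and can rely on what Common.props defined -->
</Project>
```

Because this import sits outside the “PropertySheets” import groups, the Property Manager will not manage it for you, which is the loss of UI support mentioned above.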

Within that we define all kinds of global settings, and import yet others. I’ll talk about this kind of structure in a future blog post about the organization of large build trees.

Too much of a good thing

Usually the problem is getting enough parallelism to exploit all your machine’s cores. But the reverse problem is possible, and although it’s a nice problem to have, it needs fixing because it will cause your machine to thrash. Here’s what task manager might look like when this is happening:

image

In this case, on a box with 8 CPUs, I enabled /MP on all my projects in the solution, and then built it with msbuild.exe /m (I didn’t need to use the command line to have this problem; the same could happen in Visual Studio). If dependencies don’t prevent it, MSBuild will kick off 8 projects at once, and in each of those CL will run 8 instances of itself at once, so we could have up to 8 × 8 = 64 copies of CL all fighting over my cores and my disk. Not a recipe for performance.

You can expect that one day the system will auto-tune itself here, but for now, if you have this problem, you need to do some manual adjustment. Here are some ideas:

Dial down the values globally

Reduce /m:4 to /m:3, for example, or use a property sheet to change /MP to /MP2, say. Easy, but a blunt instrument: if there are points elsewhere in your build where there is a lot of project parallelism but not much CL parallelism, or vice versa, you probably just slowed them down.

Tune /MP for each project and configuration

A project that compiles at a relatively parallelized point in the build is not such a good candidate for /MP, for example. You might adjust by configuration as well. Retail configuration can be much slower to build because the compiler’s optimizing more: that might make it interesting to enable /MP for Retail and not Debug.
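One way to express a per-configuration choice is a condition on the ItemDefinitionGroup. This sketch assumes the optimized configuration is named `Release` and the other `Debug`; use whatever names your projects actually define.

```xml
<!-- Assumed configuration names: Release (optimized) and Debug -->
<ItemDefinitionGroup Condition="'$(Configuration)'=='Release'">
  <ClCompile>
    <!-- Slow optimized compiles benefit most from /MP -->
    <MultiProcessorCompilation>true</MultiProcessorCompilation>
  </ClCompile>
</ItemDefinitionGroup>
<ItemDefinitionGroup Condition="'$(Configuration)'=='Debug'">
  <ClCompile>
    <!-- Leave the cores to project-level parallelism here -->
    <MultiProcessorCompilation>false</MultiProcessorCompilation>
  </ClCompile>
</ItemDefinitionGroup>
```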

Get super custom

In your team, you might have a range of hardware. Perhaps your developers have 2-CPU machines, but your nightly build is on an 8-CPU beast. Yet they both need to build the same set of sources, and you don't want any box to be either slow or thrashing. In this case, you could use environment variables, and Conditions on the MSBuild tags. Almost all MSBuild tags can have Conditions.

Here’s an example below. When a property “MultiprocCLCount” (which I just invented) has a value, and it’s greater than 0, /MP is enabled with that value.

image
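A sketch of such a condition follows. It assumes the invented property name `MultiprocCLCount` from the text and relies on the emptiness check being evaluated before the numeric comparison; treat it as one possible way to write it rather than the exact content of the screenshot.

```xml
<!-- Applies only when MultiprocCLCount is set and greater than 0 -->
<ItemDefinitionGroup Condition="'$(MultiprocCLCount)' != '' and $(MultiprocCLCount) &gt; 0">
  <ClCompile>
    <MultiProcessorCompilation>true</MultiProcessorCompilation>
    <!-- Passes the count through, e.g. /MP8 when MultiprocCLCount is 8 -->
    <ProcessorNumber>$(MultiprocCLCount)</ProcessorNumber>
  </ClCompile>
</ItemDefinitionGroup>
```

Putting this in a shared property sheet keeps the rule in one place; when the property is unset, the block is skipped and each project's own settings apply.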

MSBuild pulls in all environment variables as its initial properties when it starts up. So on my fast machine, I set an environment variable MultiprocCLCount=8, and on my developer boxes, I set MultiprocCLCount=2.

The build machine’s script could also parameterize the /m switch going to MSBuild.exe, like /m:%MultiprocMSBuildCount%

Two other properties that might be useful in exotic conditions: $(Number_Of_Processors) is the number of logical cores on the box; it comes straight from the environment variable of the same name. $(MSBuildNodeCount) is the value that was passed to /m on msbuild.exe or, within VS, the value from Tools>Options for project parallelism.
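If you want to see what these two properties evaluate to on a given machine, a small diagnostic target such as this sketch can log them. The target name `ShowParallelism` is made up, and hooking it before `Build` is just one convenient choice.

```xml
<!-- Hypothetical diagnostic target: logs both values at the start of a build -->
<Target Name="ShowParallelism" BeforeTargets="Build">
  <Message Importance="high"
           Text="Logical cores: $(Number_Of_Processors); MSBuild nodes: $(MSBuildNodeCount)" />
</Target>
```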

That’s it. I hope while walking you through /m and /MP I’ve also given you an overview of some MSBuild features and how much flexibility they give you to configure your build process.

Optimizing your build speed is a huge topic so look for more blogging on this subject from me.

Dan – Developer Lead, MSBuild

  • As for why you might want to set this on a file-by-file basis: you can't build files with #import or your precompiled headers with /MP, so you then need to turn off MP on those files.

    I was hoping that VS2010 having everything done in MSBuild would mean that there would be only one place to set the parallelism, and that we wouldn't have to set both inter-project and intra-project parallelism.  Alas, that does not appear to be the case.

  • Greg, thanks for pointing that out. There are 3 or 4 cases where /MP isn't allowed. I've updated the text with that example.

    Auto-tuning parallelism is definitely on our radar for a future version. Ideally we could do it without knowledge of what parallelism tools are doing. For /MP, for example, the MSBuild engine itself does not know about it. For other tools, it might be built-in, and the number of threads the tool uses might not even be deterministic. So the solution in part might involve watching CPU and disk counters and backing off concurrency when the machine approaches saturation. We've been talking to the parallel computing platform folks (http://msdn.microsoft.com/en-us/concurrency/default.aspx) to see what they can do to help us here. I believe they already look at CPU saturation, but not IO, and they do not coordinate between PCP users in different processes: MSBuild is multi-process.

    Look for more parallelism to be added to VS itself in future versions, as well.

    Dan

  • What I was thinking was that MSBuild would be the piece in ultimate control, and it would do all the division into different CL invocations itself, instead of just launching one CL for the entire project (which is what it sounds like it is doing above).

  • We do and we don't :)  In MSBuild 4 we added a lot of capability internally to allow MSBuild to be able to deal with just these sorts of scenarios, but we did not have time to implement all of the goodies which exploit that capability - only some of them and this was one which didn't make it this time around.  For instance, if you could say in your project that you can run a batched task in parallel, then you could take each of your input files, run a CL.exe for each one (batching on the %Item.Identity metadata for instance), and it would automatically limit the number that were run among all MSBuild nodes to ensure the saturation example above doesn't occur.  

    This does require a slight change to the C++ project files, as well as us implementing parallel batching support in MSBuild, but these are not big changes now that we've done the hard work internally.  We are definitely looking at this and other scenarios to ensure MSBuild can fully utilize your processing and I/O subsystems as automatically as possible.

    - Cliff

  • Are there any plans to include a capability like incredibuild's distribution in upcoming editions of Visual Studio?

  • This is disappointing to say the least.

  • I second Jonathan Moore's request. We really need distributed C++ builds! Sadly, our Mac Xcode builds slaughter our VC build times thanks to distributed gcc. This should be built-in and not a (quite expensive) bolted-on tool.

  • Another vote for incredibuild like integration with visual studio.  For large applications like the software we develop it's invaluable in order to keep build times small.  Without incredibuild it takes around 40 minutes to build one configuration.. and yet with it - 3 minutes!

    We love incredibuild - but since they've started charging for extra cores we've been hoping VS would come along and provide some competition!

  • Thanks for the feedback. If anyone's interested in providing more data, some along these lines would be useful:

    How much of the performance improvement would you get if distribution of the compile was done at the project granularity - in other words, is it important to be able to distribute subsets of the sources of single projects as well? (Perhaps if you are doing a full clean build, the project granularity is sufficient?)

    I understand that Incredibuild doesn't much distribute linking. Roughly what proportion of your build times are linking rather than compilation? Do you use /LTCG (this can make linking much longer, and more CPU bound)? How IO dominated is your linking, rather than CPU dominated? For your source tree, is the linking part of a typical build distributable - in other words, are there several distinct linker outputs?

    Some idea of numbers is always useful: numbers of source files, number of projects, number distinct linker binary outputs? What your builds tend to do - full clean builds, or build just a few cpp files. How long your builds take.

    Dan

  • We looked at Incredibuild 4 years ago, and back then it didn't distribute linking properly (and our linking times were large).

    But I think they were working on it. Also, with incremental linking, linking is quite fast.

    Keep up the good work tho. Every bit helps.

    Our builds are 5-45 secs incremental, or 10-20 mins from clean (argh).

    Also, parallel support for batch builds would've been nice.

  • For us:

    Roughly 3% of the build time is linking / link-related activities.

    Raw build time is 10 hours, link time is 17 minutes (on single core, lots of RAM, fast disk IO).  About twice the time is needed for a debug build.

    Linking is slower on lower-specced machines.

    20,000 source files (half .cpp, half .h, roughly)

    344 projects, similar number of built libraries.

    Probably 75% of builds are incremental builds for a few .cpp files.  About half the rest are incremental builds for .h file changes that require recompiling a hundred or so files.  The rest are complete rebuilds triggered by either a cross-merge of branches or change to a widely used .h file.  If rebuilding more than >10% of the code base full rebuild is faster than incremental.

  • Jonathan, this comment surprises me

    "If rebuilding more than >10% of the code base full rebuild is faster than incremental."

    Is this the case in VS10 (in which the build engine and incremental mechanism is completely different)? If so that's concerning.

    If it's not VS10, what incremental times do you get? A good benchmark is a complete no-op build; a typical build is then that time plus a bit of compilation. My hope is that we have very fast incremental.

    Dan

  • Following along from the last comment -- when you do those incremental builds (which are slow for you) are any tools running that don't need to ? (I usually just grep for ".exe" in the log)

    Dan

  • I should have been clearer; we currently use incredibuild/vs2005.  We're hoping to move to vs 2010 later this year.

    The ">10%" comment applied to that setup, where a full distributed build (across 30 agents, without debug symbols) takes around 20-35 minutes depending on the machine, and a single cpp incremental build takes < a minute in VS if we just relink or 2-3 minutes if we use the incredibuild/vs2005 combo and do proper dependency checking.  I think it's likely that the odd performance behavior is down to incredibuild rather than VS2005.

    The main thing running that might affect this is an antivirus service (as developers we'd like to get rid of it, but it is well and truly mandated by the powers that be).  

    The new incremental build mechanism is definitely something we will look forward to investigating.  If it works well, then it could make parallel builds on a single box through VS be the best way to do nearly all incremental builds.

  • So VS2010 is basically the same as VS2008 as far as parallel building.

    We have a 60 project solution, with millions of LOC, that builds in roughly 8 minutes, on an 8-core box. We have /MP and parallel projects turned on.

    The bottleneck now is building PCH's. Each project has its own PCH, and because of the way Visual C++ works, it is not possible to reuse PCH's across projects. This has been reported since at least VS2005 - I guess it will never be fixed...

    Incredibuild can be a useful patch for codebases that are poorly structured to begin with. In the long run it's better to restructure the codebase. Minimize dependencies, keep header files small, use PCH's.

    Regarding how to set the /MP flag, we no longer maintain VS solutions in VS, because of the tediousness of doing such things. We now have a cmake-generated solution, and the difference is like night and day.