Quick Tips On Using Whole Program Optimization


Hi, I’m Jerry Goodwin from the Visual C++ code generation and optimization team, with a couple quick tips on using Whole Program Optimization, also referred to as Link Time Code Generation (LTCG).

 

If you’re writing native C++ code, you can typically speed up your optimized code by about another 3-4% by adding the /GL flag to your compiles.  This flag tells the compiler to defer code generation until you link your program. Then at link time the linker calls back to the compiler to finish compilation. If you compile all your sources this way, the compiler optimizes your program as a whole rather than one source file at a time. For users building with the IDE, this option is found under the C/C++ optimization settings, and is already on by default for retail builds in new projects.

 

Using Whole Program Optimization provides the optimizer with a number of extra optimization opportunities, but I’ll give just one example. Many people are already familiar with the benefits of inlining a called function into the caller. We can only do inlining when we are generating code for both the calling function and the called function at the same time. With Link Time Code Gen we can inline functions from one source file into callers defined in another source file, as long as both source files were compiled with /GL.
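As a rough illustration of the cross-file inlining described above, here is a minimal sketch. The file names math_utils.cpp and main.cpp, the functions square and sum_of_squares, and the output name demo.exe are all made up for this example. The two "files" are shown in one listing so that it compiles on its own. Whether the optimizer actually inlines the call depends on its heuristics.

```cpp
// Sketch only: in a real project the two parts below would be separate files
// (hypothetical names math_utils.cpp and main.cpp), built from a command
// prompt roughly like this:
//
//   cl /c /O2 /GL math_utils.cpp main.cpp
//   link /LTCG /OUT:demo.exe math_utils.obj main.obj
//
// They are combined here so the example compiles as a single file.

#include <cassert>
#include <cstddef>

// ---- math_utils.cpp (hypothetical) ----
// A small helper. Without /GL, a caller in another .cpp file sees only the
// declaration, so the compiler has to emit a real call to it.
int square(int x) {
    return x * x;
}

// ---- main.cpp (hypothetical) ----
// In the separate-file layout, this declaration is all the caller knows.
int square(int x);

// Calls square() in a loop. When both files are compiled with /GL and linked
// with /LTCG, the optimizer sees both bodies at link time and may inline
// square() into this loop.
int sum_of_squares(const int* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += square(values[i]);
    }
    return total;
}
```

Calling sum_of_squares on the array {1, 2, 3} returns 14 either way. Inlining changes only the generated code, not the result.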

 

If you do use /GL, here are four caveats to keep in mind:

 

1.       When building from the command line or via makefiles, you need to add the /LTCG switch to the link command line to tell the linker to expect to see one or more object files that were compiled with /GL. If you don’t, some build time will be wasted because the linker will have to start over when it gets to the module compiled with /GL. If you build through the IDE this is in your project configuration settings on the Linker optimization page.

2.       Using /GL reduces your compile times but increases your link time, because code generation moves from the compile step to the link step. Overall build time might increase a little, but shouldn’t increase a lot.

3.       Don’t compile managed code with /GL. Link time code gen provides little or no benefit to managed code, and this option combination (/GL /clr) is being removed in the next compiler release, so you can future-proof your build by using link time code generation only for native code. If you’re building managed code using the IDE, the default setting is to use /GL in release builds, and I recommend you disable it for managed code. For mixed managed and native code, compiling only the native code with /GL and linking with /LTCG gives best results.

4.       Never use /GL for code you intend to put in a library and ship to your customers. Doing so means that your customers will be doing the code gen for your library when they link their application. Since some of your customers could have different versions of the compiler, shipping a lib built this way could cause various maintenance problems for you. If your customer’s compiler is from a prior release, their link may fail. If their version is newer than yours, the code they generate won’t be exactly equal to what you’ve tested, and could behave differently for them than when you tested it. In VS 2008, the IDE default for the class library template release configuration is to build using /GL, and I strongly encourage everyone to reset that.

 

Here are links for more information on this topic:

 

The /GL compiler switch (http://msdn.microsoft.com/en-us/library/0zza0de8.aspx)

The /LTCG linker switch (http://msdn.microsoft.com/en-us/library/xbf3tbeh.aspx)

A detailed article about Link Time Code Generation (http://msdn.microsoft.com/en-us/magazine/cc301698.aspx)

 

  • That, and there's a bug with /GL and /LTCG that occasionally causes the build to hang when combined with /MP (use multiple processors).

    I reported this on Connect a while ago, and a fix is now apparently available via PSS.

  • "Source file" here is not qualified.

    Does it mean both .h and .cpp files?

    I was pretty sure one of the options didn't like header only implementation.

  • It means whatever is listed on the command line when passed to the compiler, typically a .c or .cpp file (though any file type can be compiled using the /Tp or /TP flags). The compiler produces one .obj file for each source file compiled.

  • Then I must have confused it with another company that said that only if you separate your headers and implementation in .cpp files you would benefit from some XYZ Company VC++ optimisation feature..

    I don't see many obj files produced for header files with implementation/templates/etc to be frank.. (reads: any).

  • in #4 "shipping a lib built this way" by lib I am assuming you mean a static library, not an import library. Correct?

  • Can we get the compiler/linker to emit a message stating the function X was automatically inlined?

  • re #4, yes I was talking about static libs. If you are shipping a .dll with an import library, you can use LTCG to build the .dll and it should be beneficial. The import lib will be fine, too. Thanks for the clarification.

  • In a non-LTCG compile, we already have W4-level warnings that indicate when a function you didn't ask to be inlined was inlined (C4711) and for the case where you did ask for something to be inlined but it wasn't (C4710). There's also the C4714 case where a function you marked __forceinline can't be inlined for some reason.

    These messages are emitted during code generation, so in the LTCG case you'd see the same messages emitted during the link step. But to enable them you have to turn them on when compiling, because they're compiler command line options, not linker command line options. All the options the compiler would normally pass to the code generator in a regular compile are saved in the /GL-format .obj file and then the code generator honors them when the linker calls the code generator at link time. Another option like this is /wdNNNN (which you might want to use along with /W4 to filter warnings you aren't interested in).

  • A slight correction: C4711 is off by default, so you'd need to use /Wall to get it turned on from the cl.exe command line. Another way would be to add this to your source files:

    #pragma warning ( 1 : 4711 )

  • Of course, any 3-4% performance gain is swamped by the possible 30-40% performance loss resulting from a bug introduced in Visual Studio 2008 SP1. The last response to reports of this bug via Connect was that the performance loss is "by design" (Connect bug ID 389232). Also see IDs 383764 and 402589.

  • LOL. Do you have the URLs to the Connect items?

    I've noticed this and other compilers can really generate better especially CRT related code..

    And what if we really cannot retype, repeat and repeat all the template qualifications and use only a header file?

  • http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=389232

    That bug is not compiler-related, it's IDE-related.

  • Sounds good in theory, but in reality VC++ 6.0 still compiles faster, smaller, code than 2005/2008/2010 in almost all situations (I've yet to find a situation where it doesn't, but that doesn't mean one doesn't exist.)

  • I have just installed VS 2008 alongside my old VS 2005, all of them fully updated (service packs) and it seems the VS 2008 cl.exe (Compiler Version 15.00.30729.01) is slower than the VS 2005 one (Compiler Version 14.00.50727.762). Is it correct? I have tested it with a lot of projects and it seems the 2008 one is about 30% slower than the old compiler. I am really disappointed :(

    Example of cmd line (have tested tons of options but the new compiler is always slow):

    cl /O2 /I "G:\Boost-1.38.0\include" /I "G:\zlib-1.2.3\include" /I "../Include" /D "WIN32" /D "NDEBUG" /D "_LIB" /D "_UNICODE" /D "UNICODE" /FD /EHsc /MD /Fo"Release\\" /Fd"..\..\Release\vc90.pdb" /W3 /c /Zi /TP /MP "file1.cpp" "file2.cpp" "file3.cpp" "file4.cpp" "file5.cpp"  "file6.cpp" "file7.cpp" "file8.cpp"
