A compiler can optimize away data or a function only if it can prove that the data or function will never be referenced. In a non-LTCG compile (i.e. a build with Whole Program Optimization (WPO) disabled), the compiler's visibility is limited to a single module (.obj), so for data and functions with global scope, the compiler can never know whether other modules will use them. As a result, the compiler can never optimize them away.
The linker has a good view of all the modules that will be linked together, so it is in a good position to optimize away unused global data and unreferenced functions. The linker, however, operates at the section level, so if unreferenced data or functions are mixed with other data or functions in a section, the linker cannot extract and remove them. To equip the linker to remove unused global data and functions, we need to put each global data item or function in a separate section, and we call these small sections "COMDATs".
Today, the /Gy compiler switch instructs the compiler to package only individual functions as packaged functions, or COMDATs, each with its own section header information. This enables function-level linkage and the linker optimizations ICF (folding together identical COMDATs) and REF (eliminating unreferenced COMDATs). In VS2013, we have introduced a new compiler switch, /Gw, which extends these benefits (i.e. the linker optimizations) to data as well.
To make this concrete, let us take a look at the example below. Feel free to try it out yourselves:
Figure 1: Linker optimizations (i.e. REF) triggered from using the /Gy compiler flag
Suppose a user compiles the code snippets in figure 1 (foo.cpp and bar.cpp) with and without the /Gy compiler flag and then links them with linker optimizations enabled (link /opt:ref /map foo.obj bar.obj). In the resulting map file, one can observe that with /Gy the function 'foo' has been removed. However, the global data 'globalRefCount' still appears in the map file. As mentioned before, /Gy only instructs the compiler to package individual functions as COMDATs, not data. Supplying the /Gw compiler flag in addition to /Gy packages both data and functions as COMDATs, allowing the linker to remove both the function 'foo' and 'globalRefCount'.
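The figure itself is not reproduced in this text, so here is a minimal sketch of the kind of code it describes. Only the names 'foo' and 'globalRefCount' and the two file names come from the post; the function bodies and the function 'bar' are assumptions made purely for illustration, and both files are shown in one listing.

```cpp
// Sketch only: the bodies and 'bar' are hypothetical, not the original figure.

// ---- foo.cpp ----
int globalRefCount = 0;   // global data; nothing reachable from main uses it

void foo() {              // global function that no module ever calls
    ++globalRefCount;     // the only reference to globalRefCount
}

// ---- bar.cpp ----
int bar() {               // assumed to be the only code reachable from main
    return 1;
}
```

With /Gy alone, 'foo' lives in its own COMDAT and /opt:ref can discard it, while 'globalRefCount' shares an ordinary data section and stays. With /Gy and /Gw together, both are COMDATs, so the linker can drop both.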
Given that with LTCG enabled the compiler's visibility extends beyond a single module, it might not be obvious what a user gains from enabling this feature in WPO builds. For example, if you compile the example in figure 1 with WPO, the compiler can optimize away both the function 'foo' and the data entity 'globalRefCount'. However, if the example is changed slightly to what is shown in figure 2, compiling with WPO alone does not help. Once the address of a global variable is taken, it is very hard for the compiler to prove that the global is not read or written by other functions in the magic world of pointers, and the compiler gives up optimizing such scenarios even with WPO enabled.
But with the help of /Gw, the linker can still remove unreferenced data entities here, because the linker's REF optimization is not blocked by things like a taken address. The linker knows precisely whether a data item is referenced, because any reference to global data shows up as a linker fixup (COFF relocation), and that has nothing to do with whether the address is taken. The example below may look contrived, but it translates easily to real-world code.
Figure 2: Address of a global variable is taken
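As the figure is not reproduced in this text, the following is a minimal sketch of what an address-taken variant might look like. The helper 'increment' and the function 'bar' are hypothetical names introduced for illustration; only 'foo' and 'globalRefCount' come from the post.

```cpp
// Sketch only: 'increment' and 'bar' are hypothetical, not the original figure.

int globalRefCount = 0;          // global data

void increment(int* counter) {   // writes through a pointer
    ++*counter;
}

void foo() {                     // still never called by any module
    increment(&globalRefCount);  // the address of the global is taken here
}

int bar() {                      // assumed to be the only code reachable from main
    return 1;
}
```

In this sketch the only relocation against 'globalRefCount' comes from 'foo'. Once the linker discards the unreferenced 'foo' COMDAT, nothing references the data COMDAT any more, so with /Gw it can be discarded too, regardless of the pointer.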
In WPO-enabled builds, and only in those, we also benefit from the linker's ICF optimization (link /ltcg /map /opt:icf foo.obj bar.obj /out:example.exe) in addition to REF when /Gw is on. Take the example in figure 3: without /Gw, there will be two identical copies, 'const int data1' and 'const int data2', in the final image. If we enable /Gw, 'data1' and 'data2' will be folded together. Please note that the ICF optimization is applied only to identical COMDATs that are read-only and whose address is not taken. If the address of a data item is never taken, breaking address uniqueness through ICF cannot lead to any observable difference, so the folding is valid and conforms to the standard.
Figure 3: Linker ICF Optimization for Data COMDAT
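The figure is not reproduced in this text, so here is a rough sketch of identical read-only data of the kind it describes. The post only names 'data1' and 'data2'; declaring them as arrays and the two reader functions are assumptions made for illustration.

```cpp
// Sketch only: the array form and the reader functions are hypothetical.

const int data1[] = {1, 2, 3};   // read-only data
const int data2[] = {1, 2, 3};   // identical contents to data1

// The readers only fetch values; they never store or hand out a pointer
// to either array.
int readData1(int i) { return data1[i]; }
int readData2(int i) { return data2[i]; }
```

Comparing the map files produced with and without /Gw is the way to check whether the two data items were actually folded in a given build.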
To summarize, with the /Gw compiler switch the linker optimizations (REF and ICF) now also work on unreferenced and identical data COMDATs. For folks already taking advantage of function-level linkage, this should be fairly easy to understand. We have seen double-digit percentage gains in size reduction when enabling this feature for binaries that make up some high-volume Microsoft products, so I would encourage you folks to try it out as well and get back to us. At this point you should have everything you need to get started! Additionally, if you would like us to blog about some other compiler technology, please let us know. We are always interested in learning from your feedback.
Wow, this is something I've been hoping you would do for a long time. Thanks for making this available!
Does this have any implications for edit and continue? I know /Gy is needed for edit and continue to work, and when working with edit and continue you often get "edit and continue does not support changes to data types". Does separate packaging of data mean that it will be possible for edit and continue to make changes to data types?
When using edit and continue I often get these errors, even when not really changing the object it mentions. If this could help resolve those problems that would be amazing.
Thanks for the post!
I have been using VC++ for years now (VC6...) and I easily admit I did not follow all the great additions you guys have been coding in the compilers (did not miss C++11, LTCG or PGO though :-)).
Please more of those! We never have enough knowledge of what is available!
@Mike, Jeanga: Thanks :).
@edl_si, I doubt this will have any implications in terms of improvements to Edit and Continue. What version of Visual Studio are you using? I can try to understand why you see 'EnC does not support changes to data-types'. Are you changing global types?
Thanks for the /Gw switch. I am very interested in learning where you guys are with loop-interchange optimization. Is it even on your long-term roadmap, like 2020? Apparently, Intel's compiler already has it: stackoverflow.com/.../863980
Thanks for the response Ankit.
This is Visual Studio 2010, but the issue reproduces in 2012 and 2013 RC too.
I'm not changing anything. There are some files (if they include certain headers) that fail to compile when edit and continue is applied with a null change (say, just inserting a space in the cpp). Here is the sort of error we get (the types involved can vary a bit depending on the file you try to change):
small_object_stl_allocator.h(200) : error C2220: warning treated as error - no 'object' file generated
small_object_stl_allocator.h(200) : warning C4656: 'std::allocator<_Ty>' : data type is new or has changed since the latest build, or is defined differently elsewhere
small_object_stl_allocator.h(200) : fatal error C1092: Edit and Continue does not support changes to data types; build required
small_object_stl_allocator is a custom allocator we use with vector etc.
The next version of the compiler will indeed perform loop interchange where this would improve code speed, for some code patterns (for example, "perfect" loop nests, where statements are confined to the body of the innermost loop). The original academic research into the various forms of kicking and punching loops around (formally, we call them "code transformations") so that they improve memory access speeds dates back to the early '90s.
Please note that the compiler won't give diagnostics on when this was done, or not. You would need to inspect the resulting assembly code. Once mixed with auto-vectorization, inferring all the optimizations carried out, by inspecting the assembly code, is not for the faint-hearted :-)
Why isn't this option on by default? Is it still experimental quality?
Even though on its own it's completely standards compliant, potentially breaking behavior can result when it is combined with /Gy.
With /Gy, function addresses are not guaranteed to be unique. We hit one case internally where someone had worked around this by referencing a global variable inside otherwise identical functions - before /Gw, this was enough to inhibit COMDAT folding of the functions and get unique addresses. Now, /Gw can fold together those globals, thus allowing the now perfectly identical functions to be folded together, defeating this workaround.
We really wanted this on by default (that's why it doesn't work on addressed variables, unlike Gy) - but hitting just this one case was sufficient for the "off by default" proponents to win. It had nothing to do with us feeling the feature was experimental or of lower quality.
@Terry Mahaffey, you hit a breaking case in internal code. Was it FCUnique in the CLR? I just stumbled upon it and its only purpose seems to be to make the method contents different.
I can't find /Gw anywhere in the Linker settings, is there no GUI for it yet? Are you just supposed to add it to the command line? (VS2013 RC)