Shrink my Program Database (PDB) file


Overview

PDB stands for Program Database, a proprietary file format developed by Microsoft for storing debugging information about a program (commonly a binary such as a DLL or EXE). PDB files typically have a .pdb extension. A PDB file is usually created from source files during compilation, although other variants exist (it is also created by the linker when /Z7 is used). It stores a list of all symbols in a module with their addresses and possibly the name of the file and the line on which each symbol was declared. This symbol information is not stored in the module itself because it takes up a lot of space.

This post goes over a few ways to shrink your PDB size, so let's get started. To demonstrate the effectiveness of these techniques I have used the popular BingMaps (bingmaps.dll) Windows Store application.
 

#1. The /OPT:REF and /OPT:ICF effect

The linker has a good view of all the modules that will be linked together, so it is in a good position to optimize away unused global data and unreferenced functions. The linker, however, operates at the OBJ section level, so if unreferenced data or functions are mixed with other data or functions in a section, the linker cannot extract and remove them. To let the linker remove unused global data and functions, each global data item or function must be placed in its own section; we call these sections "COMDATs". (COMDAT construction is enabled by the /Gy compiler flag for functions and the /Gw compiler flag for global data.) With COMDATs in place, the /OPT:REF and /OPT:ICF flags enable the linker optimizations: /OPT:REF eliminates functions and data that are never referenced, and /OPT:ICF performs identical COMDAT folding. Together they produce a smaller binary and hence a smaller PDB.

Please note, however, that enabling these linker optimizations currently disables incremental linking.
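As a rough illustration of what the two linker optimizations act on, here is a minimal sketch of a source file. All function and variable names are hypothetical, and the build line in the comment is an assumed invocation that simply combines the flags named above; whether a particular function is actually removed or folded depends on the compiler and linker version.

```cpp
// Illustrative only: every name in this file is hypothetical.
// Assumed MSVC build line, using the flags discussed in this post and
// assuming a main() elsewhere that calls only total_pixels():
//   cl /O2 /Zi /Gy /Gw demo.cpp /link /OPT:REF /OPT:ICF

// Never referenced: with /Gy this function lives in its own COMDAT,
// which makes it a candidate for removal by /OPT:REF.
int unused_helper(int x) { return x * 31 + 7; }

// Never referenced global data: /Gw places it in its own COMDAT,
// which makes it a candidate for removal by /OPT:REF as well.
int unused_table[256] = {1};

// Two functions with identical bodies. They usually compile to identical
// machine code, which makes them candidates for folding by /OPT:ICF.
int width_in_pixels(int tiles)  { return tiles * 256; }
int height_in_pixels(int tiles) { return tiles * 256; }

// The only function the rest of the program is assumed to call.
int total_pixels(int tiles_x, int tiles_y) {
    return width_in_pixels(tiles_x) * height_in_pixels(tiles_y);
}
```

Without /Gy and /Gw, the unused items above could share a section with code that is still referenced, and the linker would have to keep the whole section.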

#2. The /d2Zi+ effect

The cryptic and undocumented /d2Zi+ switch is commonly used, especially for debugging optimized code. In particular, it provides more debug information for locals and inlined functions. The side effect of using this flag in all scenarios, however, is PDB size growth. The exact size increase is application dependent.
 

#3. Compress the PDB using /PDBCompress

For clean link scenarios, /PDBCOMPRESS instructs the linker to open the target PDB file in a mode that leads the operating system to compress the file content automatically as debug records are written into the PDB file. This results in a smaller PDB. The switch has no effect if the operating system's file system does not support compression, or if the linker is asked to update an existing PDB file to which file system compression has not been applied.

Figure 1: Effect of /pdbcompress on BingMaps PDB

Please note, the impact of this compression can be observed by looking at the 'size on disk'. In Windows Explorer, compressed PDBs are shown in blue.
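The same comparison can be made programmatically. The following is a minimal sketch, not part of the linker or any official tool: the FileSizes struct and the query_sizes and percent_saved helpers are hypothetical names. On Windows it uses the Win32 GetCompressedFileSizeA call, which reports the compressed byte count and may therefore differ slightly from the cluster-rounded 'size on disk' shown in the file properties dialog. On other systems it falls back to the allocated block count so the sketch still compiles.

```cpp
#include <cstdint>
#include <string>
#ifdef _WIN32
#include <windows.h>
#else
#include <sys/stat.h>
#endif

// Hypothetical helper type: logical size versus space used on disk.
struct FileSizes {
    std::uint64_t logical;  // size the file reports to applications
    std::uint64_t on_disk;  // bytes actually stored (after compression)
    bool ok;                // false if the file could not be queried
};

FileSizes query_sizes(const std::string& path) {
    FileSizes result{0, 0, false};
#ifdef _WIN32
    WIN32_FILE_ATTRIBUTE_DATA info;
    if (!GetFileAttributesExA(path.c_str(), GetFileExInfoStandard, &info)) {
        return result;
    }
    result.logical =
        (static_cast<std::uint64_t>(info.nFileSizeHigh) << 32) | info.nFileSizeLow;

    DWORD high = 0;
    SetLastError(NO_ERROR);
    DWORD low = GetCompressedFileSizeA(path.c_str(), &high);
    if (low == INVALID_FILE_SIZE && GetLastError() != NO_ERROR) {
        return result;
    }
    result.on_disk = (static_cast<std::uint64_t>(high) << 32) | low;
#else
    // Fallback for non-Windows systems; assumes 512-byte st_blocks units.
    struct stat st;
    if (stat(path.c_str(), &st) != 0) {
        return result;
    }
    result.logical = static_cast<std::uint64_t>(st.st_size);
    result.on_disk = static_cast<std::uint64_t>(st.st_blocks) * 512;
#endif
    result.ok = true;
    return result;
}

// Percentage of the logical size saved on disk; 0 for an empty file.
double percent_saved(const FileSizes& sizes) {
    if (sizes.logical == 0) {
        return 0.0;
    }
    return 100.0 * (1.0 - static_cast<double>(sizes.on_disk) /
                              static_cast<double>(sizes.logical));
}
```

Calling query_sizes on a PDB produced with and without /PDBCOMPRESS and passing each result to percent_saved gives a quick way to compare the two builds.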
 

#4. Incremental Update to PDB's

During incremental linking, we don't remove unreferenced type records (the same as in full linking), and for throughput reasons we also don't remove obsolete public and global records. Over extensive use, that is, numerous rebuild/relink iterations, the size of the PDB grows. We recommend a clean link (build) when possible to reduce the size of PDBs.

Sum it all up

To conclude, attached below is the result of enabling the above techniques on the popular BingMaps Windows Store application.


As you can see there are some clear wins with the methodologies described. Please note the vanilla build setting here is an optimized (/O2) build with /Zi (Program Database) enabled.

Reach out to us if you have questions, concerns or feature requests w.r.t the linker and PDB's.

Additionally, if you would like us to blog about some other compiler technology or compiler optimization, please let us know; we are always interested in learning from your feedback.

  • It is also worth mentioning that PDB files that only contain public symbols are much smaller. Stripping private symbols can be achieved with the /PDBSTRIPPED linker option or the PDBCopy.exe tool. This comes at the expense of making actual debugging more difficult, which may or may not be worth it depending on the scenario.

    Also, when checking symbols into a symbol store, it is a good idea to use the /compress flag on SymStore.exe. This causes a compressed version of the PDB (equivalent to compressing with makecab or compress.exe) to be stored with the extension ".pd_". This is better for symbol servers than using the /PDBCompress linker option (which is equivalent to using FSCTL_SET_COMPRESSION) because the compressed version will be transferred over the network and decompressed by the client.

    Unfortunately, dbghelp.dll isn't smart enough to decompress PDB files that don't come from a symbol store so you really can't benefit from it without using a symbol store.

  • Why are  /OPT: REF and /OPT: ICF not enabled by default?

  • What's the motivation for making PDBs smaller?  Does it improve build times?  Is that worth the potential costs of having less information available during debugging?  Are the PDBs so big that developers aren't storing them?

  • @Adrian Yes, when you have a large application producing lots of PDBs and a build system that spits out lots of builds every day, archiving those PDBs in the symbol server can be costly.

    I think we use CAB compression on the PDBs. When they are accessed from the symbol server, windbg and visual studio are able to uncompress them. This saves both on bandwidth/transfer times, but also storage space.

  • @Sean, what you are saying is 100% true. /PDBSTRIPPED does make the PDB smaller, but as you also mention, the debugging experience is affected. Thanks for sharing your experience.

    @Olaf, when using Visual Studio 2013 RTM the OOTB Release build configuration is set with linker optimizations (REF and ICF) enabled. The same is not true for Debug build configuration primarily because it breaks incremental linking.

    @Adam, the key motivation behind making PDBs smaller is the network/resource cost associated with archiving them and with copying them between machines by individual developers and distributed build systems. Additionally, for some larger applications such as Chromium we have seen PDBs close to 1 GB, and for some internal titles close to 2 GB. With the release of VS2013 RTM we have seen a few scenarios where customers have complained, and these suggestions are mainly meant for them.

    Having said that, smaller PDBs would most likely also mean shorter link times. One of the major bottlenecks in the linker today is the massaging of debug information (merging of types, etc.). Smaller PDBs would mean the linker spends less time doing so, providing faster link times.

  • In Windows 8+, there is nothing called "Windows Explorer". It has been renamed to "File Explorer".

    My two bits.

  • I don't care as much about the space as about the time it takes to link. A link with our 100-project sln may take 6 minutes or more to complete (without doing any code generation). Some of these PDB files are very large; one is over 150 MB.

  • The link covering /d2Zi+ is accidentally a mailto link.

    Is there a known future where it remains cryptic and becomes documented/supported, or possibly just replaces the meaning of /Zi? And did you compare /d2Zi+ plus optimizations? It's interesting that your experience is so different from his, unless 5% counts as noise. Though, as it turns out, looking at one of our PDBs before and after enabling the switch (with optimizations enabled) shows a little over 10% increase in size. So I suppose maybe that's a good reason for it not replacing /Zi.
