Tom Hollander's blog

patterns, practices and pontification

Enterprise Library and the Curse of MAX_PATH


Once upon a time, in the kingdom of DOS, the people all used 8.3 filenames for their programs, documents and source files. The subjects got used to files like LETTER.DOC, MSCDEX.EXE and VBRUN200.DLL. Nobody ever knew whether there was any limit to the number of 8.3 names that could be strung together to make deep paths, since there were no good GUI tools or even tab-completion, so the paths needed to be a manageable length to enable typing by hand. And besides, there were only so many files you could fit on a 20MB hard disk.

During the reign of Windows 95, the wise ruler decided that the people deserved more, and they were granted long filenames. Now any filename could have as many as 255 characters. And the people rejoiced. They could finally create files with names like "Letter to Grandma.doc". However, the more geeky members of the kingdom frowned on this new trend, and so system files and source code files tended to stick with the older 8.3 filenames.

But over time, things changed. The people started to get greedy. They started creating longer and longer filenames. They realized that they could create folders with long names and place those inside other folders with long names, while still navigating them quickly using Windows Explorer. Even the royal geeks got on board. Key system folders were given longer names, such as "Program Files", "Documents and Settings" and "My Documents". And the advent of .NET continued this trend, bringing such examples as System.Runtime.Serialization.Formatters.Soap.dll.

As it turned out, the Wicked Witch of the Kernel had foreseen this greed long ago, and had placed the Curse of MAX_PATH over the kingdom. Under this curse, any path containing more than 260 characters in total would result in random, mysterious failures, reminding the subjects that they should not take the gift of long filenames for granted. And this curse lives on to this very day.

OK, maybe that isn't quite the way it happened, but I'm not on the Windows team so I really don't have any idea how this restriction came to be or why it is still here. But, alas, MAX_PATH is real, and paths longer than that will fail in many situations, presumably based on how old the underlying API is. While the fairy tale above was just a little bit made up, I do want to tell you a true story of how the Curse of MAX_PATH has impacted us on the Enterprise Library team.

We've been running into MAX_PATH related errors on and off for some time with our codebase, and we know many of you have as well. The problem mainly manifests itself when compiling the source code inside Visual Studio or msbuild, although you can get it in other situations such as when installing from MSIs or unfolding GAX guidance package templates. At first we assumed that the issue, while unfortunate, was mitigated easily enough by avoiding installing the code into deep root paths. But over time the problem has just gotten worse, and even manifested itself with relatively modest root paths - so we decided we needed to take some action.

If you take a look at the Enterprise Library assemblies, you'll see some pretty long assembly names, such as Microsoft.Practices.EnterpriseLibrary.ExceptionHandling.Logging.Configuration.Design.dll. With the benefit of hindsight, we probably should have chosen something quite a bit shorter (although in our defense, the name is consistent with the published .NET Framework Design Guidelines). Indeed we have seriously considered shortening the assembly names (but leaving the namespaces the same) in EntLib 3.0, changing the above to something like EnterpriseLibrary.ExceptionHandling.Logging.Design.dll. However, to our surprise, we found out that a change like this across the board would still not significantly reduce the incidence of MAX_PATH issues.

How could this be? Fernando, Peter and I spent quite a bit of time investigating the root cause of the issue, and we found that the long path names in Enterprise Library could be boiled down to four main causes:

  1. Long root paths for the code base, even when using the OS defaults (such as C:\Documents and Settings\Username\My Documents\Visual Studio 2005\Projects)
  2. Deep folder hierarchies within our solutions
  3. Long assembly names
  4. Temporary resource files created in obj\debug during compilation

Now looking at these causes, there isn't anything we can do about #1. It's worth noting that the default paths are now much shorter in Windows Vista (e.g. C:\Users\Username\Documents); however, since many people still use XP, this doesn't help a lot today.

For #2 we didn't think there was a lot we could or should do - while we do have a number of subfolders within our solutions, they help code organization and discoverability, and the folders all have sensible and relatively short names such as Instrumentation or Configuration.

As mentioned before, we did feel that we could get some savings for #3. However, with the exception of some new unit test assemblies (such as Microsoft.Practices.EnterpriseLibrary.Security.Cache.CachingStore.Configuration.Manageability.Tests.dll, which we've since shortened), changing assembly names wouldn't help, since issue #4 was the one that was killing us.

So what is this issue? I've never seen this documented anywhere, but here's what we discovered. The C# compiler (no idea about the others) does a few wacky things with resource files during compilation. Resource files include embedded binaries (such as icons), string resource files and designer resources for controls and forms. Here's what the compiler appears to do for each resource:

  • Start with the default namespace for the current project (as specified in the Project Properties dialog)
  • For each filesystem folder beyond the project root where the resource is stored, append that to the default namespace
  • Append the resource file's actual name to the end
  • Stick it in the obj\debug folder

To give a practical example:

  • We have a project with a default namespace of Microsoft.Practices.EnterpriseLibrary.Security.Cache.CachingStore.Configuration.Design (the assembly name happens to be the same, although that is irrelevant)
  • That project contains a subfolder called Properties
  • In that folder is a file called Resources.resx
  • The compiler will use all of these facts to dynamically create a file called obj\debug\Microsoft.Practices.EnterpriseLibrary.Security.Cache.CachingStore.Configuration.Design.Properties.Resources.resources
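
The naming steps above can be sketched in a few lines of Python. This is only an illustration of the behaviour we observed, not the build tooling's actual code: the function names `intermediate_resource_name` and `chars_left` are hypothetical, and the directory in the example is a made-up stand-in for a real project's obj\debug folder.

```python
import ntpath

MAX_PATH = 260  # Windows limit; one character is reserved for the terminating null


def intermediate_resource_name(default_namespace, folders, resx_file):
    """Approximate the .resources file name observed in obj\\debug.

    default_namespace: the project's default namespace
    folders: subfolders between the project root and the resource file
    resx_file: the resource file's own name, e.g. "Resources.resx"
    """
    base, _ext = ntpath.splitext(resx_file)
    return ".".join([default_namespace, *folders, base]) + ".resources"


def chars_left(directory, file_name):
    """Characters still available before the full path reaches the limit."""
    full = ntpath.join(directory, file_name)
    return (MAX_PATH - 1) - len(full)


name = intermediate_resource_name(
    "Microsoft.Practices.EnterpriseLibrary.Security.Cache.CachingStore.Configuration.Design",
    ["Properties"],
    "Resources.resx",
)
print(name)
print(len(name))

# Hypothetical location, used only to show how quickly the budget shrinks.
obj_dir = r"C:\Documents and Settings\Username\My Documents\Visual Studio 2005\Projects\obj\debug"
print(chars_left(obj_dir, name))
```

A negative result from `chars_left` would mean the generated file cannot be created at that location without hitting the limit.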

Since this is based around default namespaces, rather than assembly names, shortening the assembly names alone will not get rid of these files. What you can do, of course, is change the project's default namespace. This seems silly, since the purpose of this feature is to ensure that any new classes you generate use a sensible default namespace, and I don't see why this information should be used by the compiler at all - but apparently it is. In addition to the side effect of new code files being in the wrong namespace, changing the default namespace can also result in other breaks to the code, as the runtime namespace of the resources is also derived from the default namespace, so code that accesses resources needs to be changed.

Anyway, to cut a long story just a little bit less long, we found that by changing the default namespace of a few Design projects, fixing the code that accessed resources in those projects, and shortening the assembly names of a few test projects, we were able to cut the maximum path lengths for Enterprise Library by around 20 characters - without changing any of the core block assembly names. This won't guarantee that MAX_PATH issues will go away, but it will certainly make them less common.
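
If you want to see where your own tree is closest to the limit, a small script can list the longest paths after a build. The following is a rough Python sketch, not the tool we used; the function name `longest_paths` is hypothetical, and on Windows the walk itself may skip directories that are already too deep to open.

```python
import os


def longest_paths(root, top=5):
    """Return the `top` longest absolute file paths under `root`, longest first.

    Each entry is a (length, path) tuple.
    """
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.abspath(os.path.join(dirpath, filename))
            found.append((len(full), full))
    found.sort(reverse=True)
    return found[:top]


if __name__ == "__main__":
    # Scan the current directory; point this at a solution root after building.
    for length, path in longest_paths("."):
        print(length, path)
```

Running it before and after a change such as shortening a default namespace gives a quick measure of how many characters were actually saved.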

All that remains now is to hope that the Wicked Witch of the Kernel sees the error of her ways, and that one day she will lift the Curse of MAX_PATH from us all.

  • The issue is actually not with CSC.exe, but with MsBuild and the way it generates the resources.

    You can avoid it by running it manually or modifying the build scripts.

  • I've encountered this issue before in some of the code I've written, especially code that relied on the Shell APIs. I subsequently changed the code to use the Unicode version of CreateFile (CreateFileW) or the CRT equivalent and prepended the Unicode escape prefix ("\\?\"), as per the CreateFile docs, and the problem goes away.

    MsBuild should generate scripts with escaped paths.

    If you encounter this in another tool, ask the vendor to fix it properly :-).
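
The prefix described in the comment above can be illustrated with a short Python sketch. It only shows how an absolute path might be rewritten into the \\?\ form before being handed to an API that accepts it; the function name `to_extended_length` is hypothetical, and whether a given tool or library honours the prefix is an assumption you would need to check.

```python
import ntpath

EXTENDED_PREFIX = "\\\\?\\"           # the four characters \\?\ (escaped here)
EXTENDED_UNC_PREFIX = "\\\\?\\UNC\\"  # form used for \\server\share paths


def to_extended_length(path):
    r"""Return an absolute Windows path rewritten with the \\?\ prefix.

    The prefix disables path normalisation, so the path is normalised here
    first and relative paths are rejected.
    """
    if path.startswith(EXTENDED_PREFIX):
        return path  # already in extended-length form
    path = ntpath.normpath(path)
    drive, rest = ntpath.splitdrive(path)
    if not drive or not rest.startswith("\\"):
        raise ValueError("an absolute path with a drive or share is required")
    if path.startswith("\\\\"):
        # \\server\share\dir becomes \\?\UNC\server\share\dir
        return EXTENDED_UNC_PREFIX + path[2:]
    return EXTENDED_PREFIX + path


print(to_extended_length(r"C:\Projects\EntLib\obj\debug\Resources.resources"))
```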

  • Hi, Tom,

    I don't know if you remember, but this raised its head when I was writing the IsolatedStorageBackingStore for the caching block. I wanted to have intelligent file names for the subfiles used to store cache entries into IsoStore, but I immediately ran into problems. The IsoStore area is located in some top-secret directory beneath documents and settings\user_name\applicationdata\a_bunch_more_stuff, and I ended up with about 12 characters *I* could actually use :)

    Just reminiscing :)

    bab

  • Thanks Brian - I remember this well. This was even more painful in many ways since we had absolutely zero control over the base path for isolated storage, so the only option we had was coming up with really short names for our files.

    Fun times :-)

    Tom

  • As one of my friends justly admitted, there couldn't be such name as "VBRUN200.DLL" in the kingdom of DOS... (since .DLL are The Windows Republic invention ;-) )

  • I also tested the \\?\ trick, but from my (small) experiment I learned that the managed APIs do not allow \\?\ as the first chars in a filename. And msbuild is fully managed code, so that will be a major change, but csc should be able to handle that in .NET 3.5

    :-)

  • Then there you have the bug: managed APIs should allow the NT resource naming scheme.  Sounds like a good deployment blocker bug.

    Another situation is the following: drive letters. Imagine the server with 26 disks, C-Z, plus 2 more. What are you going to call these extra drives? "AA:" and "BB:"? Clearly the NT scheme needs to be accessible from the managed layer and usable by the .NET library routines.

  • The Kingdom of DOS might provide you with a solution, at least for compiling/building issues: map the root of your sourcebase to a drive letter with the subst command.

  • In keeping with the p&p team's tradition of naming a release after the month that's just finished,
