So – one of my teammates tells me that if he leaves MFCMAPI running overnight, it has usually crashed by the time he comes in the next morning. We got a few dumps, but they were inconclusive since the stacks were completely different every time. Since we didn’t know what was going on, we started paying attention to when and where the problem happened. We noted the following:
Armed with this information, I decided to look at the dumps again. The exception thrown in each was 0xC0000006, STATUS_IN_PAGE_ERROR. OK, so that explains things: maybe some network hiccup causes us to lose the connection to the original executable, and then the next time the process needs to load a page of memory, it can’t find it?
For a while, I thought this was the end of the investigation – there’s no way MFCMAPI can control network hiccups, right? Then I stumbled across this linker option:
Basically, when this option is set, the OS will copy the whole binary image to a local swap file before running it. We set the “Swap Run From Network” switch on a test build, and he hasn’t been able to reproduce the crash since. If you’re into manually editing your .vcproj file, this is the equivalent of adding the line
to the “VCLinkerTool” section of your project’s configuration.
The next version of MFCMAPI will have this option enabled for all configurations.
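If you want to confirm that a particular build actually picked up the switch, one way is to look at the executable's header. The linker records this setting as the IMAGE_FILE_NET_RUN_FROM_SWAP bit (0x0800) in the Characteristics field of the COFF file header. The snippet below is only a rough sketch of that check: the function names are mine, not anything from MFCMAPI or the Windows SDK, and it assumes a well-formed PE file.

```python
import struct

# Characteristics bits from the PE/COFF file header.
IMAGE_FILE_REMOVABLE_RUN_FROM_SWAP = 0x0400  # copy to swap if run from removable media
IMAGE_FILE_NET_RUN_FROM_SWAP = 0x0800        # copy to swap if run from the network


def read_characteristics(data: bytes) -> int:
    """Return the COFF Characteristics field from the start of a PE image.

    `data` must hold at least the DOS header and the COFF file header.
    (Hypothetical helper written for this illustration.)
    """
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew: offset of the PE signature, stored at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The COFF header follows the 4-byte signature; Characteristics is its
    # last field, 18 bytes in (after Machine, NumberOfSections,
    # TimeDateStamp, PointerToSymbolTable, NumberOfSymbols and
    # SizeOfOptionalHeader).
    (characteristics,) = struct.unpack_from("<H", data, pe_offset + 4 + 18)
    return characteristics


def runs_from_swap_over_network(data: bytes) -> bool:
    """True if the image asks the OS to copy it to swap when run from the net."""
    return bool(read_characteristics(data) & IMAGE_FILE_NET_RUN_FROM_SWAP)
```

To use it, read the first few kilobytes of the .exe in binary mode and pass the bytes to `runs_from_swap_over_network`. A build linked with the switch should return True.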
Thanks for the tip. Why would you want to enable it for everyone, though? Increasing the size of the swap file comes at a cost. Let your coworkers set the flag locally; it's open source! :)
It's not something you can control after your process starts, so I don't see a way I could configure the behavior on the fly. The only way for my coworkers to set the flag themselves is to build the tool themselves, implying that it's running locally, in which case the switch has no effect (the OS only does this copy when run over the net).
We hit the same problem a couple of years ago and had to implement the same fix. I was really hoping for an explanation but it's nice to know we implemented the correct fix.
The explanation really is what I gave - there was a network hiccup (his machine was especially prone to them) that severed the connection between his machine and mine. The connection can be restored, but the process that was running is now timebombed. The moment it tries to access a page of code that wasn't already in memory, it will fault. Setting the linker option forces the OS to pull the pages locally first so this hiccup can't affect it.