A sneaky error

The following post summarizes my personal experience with a subtle COM error. CO_E_SERVER_EXEC_FAILURE (0x80080005) is an error that CoCreateInstance() can return, seemingly at random, if you are using an out-of-process COM server. As far as I know, many COM clients do not know how to deal with it. The issue is particularly worrisome if you write, for example, VBS scripts that create, say, WMI objects/queries under the cover. What's worse is that this error tends to appear under stress, so you won't really see it in normal usage scenarios.

Most of the time, this error means that the COM server process failed to start and register the proper class factory for this specific COM class in time. We will see in a second what can cause this, but the idea is that under heavy CPU load the machine can be pretty slow, and some out-of-process COM servers will fail to initialize in time. That said, in a limited number of cases this error is caused by bad configuration or even a bug in the server code.

What does this error look like?

In winerror.h, this error is defined as CO_E_SERVER_EXEC_FAILURE as follows:

//
// MessageId: CO_E_SERVER_EXEC_FAILURE
//
// MessageText:
//
//  Server execution failed
//
#define CO_E_SERVER_EXEC_FAILURE         _HRESULT_TYPEDEF_(0x80080005L)

Also, if you use VBScript code, the scripting engine translates this error into this exception:

0x80080005 (CO_E_SERVER_EXEC_FAILURE)   --> 429 (CantCreateObject)
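
To make the raw number a bit less opaque, here is a minimal sketch that splits an HRESULT into its severity, facility, and code fields, following the bit layout described in winerror.h. DecodeHresult and HresultParts are names I made up for illustration; they are not Windows APIs, and the sketch deliberately avoids including the Windows headers so that it builds anywhere.

```cpp
#include <cstdint>

// Fields of a 32-bit HRESULT, following the layout documented in winerror.h.
struct HresultParts {
    bool failed;             // bit 31: severity (1 = failure)
    std::uint32_t facility;  // bits 16-26: subsystem that raised the error
    std::uint32_t code;      // bits 0-15: error code within that facility
};

// Hypothetical helper (not a Windows API): splits an HRESULT into its fields.
constexpr HresultParts DecodeHresult(std::uint32_t hr) {
    return HresultParts{
        (hr >> 31) != 0,
        (hr >> 16) & 0x7FFu,
        hr & 0xFFFFu,
    };
}
```

Calling DecodeHresult(0x80080005u) yields a failure with facility 8 (FACILITY_WINDOWS in winerror.h) and code 5.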

What's going on under the cover?

I'll try to explain the instance creation process very briefly. Let's say that you attempt to create a COM object that is implemented in an out-of-process COM server. For example, if you use something like CLSCTX_ALL for a class implemented in a COM server hosted in a Windows service, the CoCreateInstance() implementation will first try to contact RPCSS.EXE on the target machine to figure out if there is already a process that implements this COM class. If there is one, RPCSS will redirect the client to that process, and in the end the client will obtain a pointer to the class factory proxy. This class factory will be used to create the COM object instance (and eventually cached for further use from that client process).

But if there is no running COM server process, then RPCSS (or the client) will attempt to start it. RPCSS will simply start this service (assuming that it's stopped) and wait a certain amount of time for this service to register its class factory for the COM class in question. As far as I know, the timeout is 120 seconds for Windows 2000/XP/2003. Normally, during startup (or immediately after it), as part of its initialization, the service has to call the CoRegisterClassObject(...) Win32 API to register the COM class factory. The equivalent ATL call is _Module.RegisterClassObjects() which, under the cover, calls CoRegisterClassObject(...) for the COM classes that were specified in the BEGIN_OBJECT_MAP/END_OBJECT_MAP macro block. As soon as CoRegisterClassObject() is invoked on the COM server side, the client will receive the pointer to the class factory.

Now, here is the catch. If the server fails to call CoRegisterClassObject() within the timeout, counted from the moment the process was started, or fails to call CoRegisterClassObject() at all for the given class factory, then the client will receive the CO_E_SERVER_EXEC_FAILURE error in CoCreateInstance(...). This can happen for a variety of reasons:
1) The machine has a high CPU load and the process takes more than 120 seconds to start and execute CoRegisterClassObject().
2) The COM server doesn't register for the right class IDs.
3) The COM server is currently stopping and there is a race condition between CoCreateInstance and the COM server stopping part.
4) There is a security problem in the way the COM server is started (this page seems to suggest misspelled passwords or a missing "Log on as a batch job" privilege for "Run As.." COM servers, but in any case I would suggest re-verifying this information for your specific configuration)
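
Case (1) is really just a race between the server's startup time and the activation timeout. The sketch below is a toy model of that race and nothing more: it does not call any COM or RPCSS API, and ActivationModel and SimulateActivation are names I invented for illustration. A worker thread plays the slow-starting server, and the waiting side gives up after a timeout, which is the situation where the real client would see CO_E_SERVER_EXEC_FAILURE. The durations are in milliseconds only to keep the model quick; the real timeout discussed above is far longer.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Toy stand-in for the activation wait; none of these names are COM APIs.
class ActivationModel {
public:
    // Called by the simulated server once its class factory is "registered".
    void Register() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            registered_ = true;
        }
        cv_.notify_all();
    }

    // Waits up to `timeout` for Register(). A false result models the case
    // where the real client would receive CO_E_SERVER_EXEC_FAILURE.
    bool WaitForRegistration(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return cv_.wait_for(lock, timeout, [this] { return registered_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool registered_ = false;
};

// Simulates a server that needs `startup` to initialize while the activation
// side waits at most `timeout`. Returns true if registration arrived in time.
bool SimulateActivation(std::chrono::milliseconds startup,
                        std::chrono::milliseconds timeout) {
    ActivationModel model;
    std::thread server([&model, startup] {
        std::this_thread::sleep_for(startup);  // slow startup under load
        model.Register();
    });
    const bool in_time = model.WaitForRegistration(timeout);
    server.join();  // the late server still finishes; its result is ignored
    return in_time;
}
```

With a short startup and a generous timeout SimulateActivation returns true; when the startup is much longer than the timeout it returns false, even though the simulated server eventually registers.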

How to work around it?

Assuming that the real problem is (1) above (i.e. high CPU load), a simple workaround would be to retry the CoCreateInstance(...) call a few times. You might add a delay between the CoCreateInstance calls so that the server gets a chance to start. However, this workaround is only a partial solution, and you have to deal with a definitive failure anyway. If some other problem caused the CO_E_SERVER_EXEC_FAILURE, you don't want to retry for a long time, otherwise the user experience will be pretty bad (i.e. the user waits 10 minutes for the script to finish, and the script fails at the end with the CantCreateObject error anyway).
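
Here is a minimal sketch of that bounded retry. It is written so that it builds without the Windows headers: Hresult stands in for the real HRESULT type, the constant is copied from the winerror.h definition shown earlier, and CreateWithRetry is a hypothetical helper name. The create callable is an assumption too; in real code it would wrap your own CoCreateInstance(...) call and return its result. The attempt count and delay are up to you, and I am not suggesting any particular values.

```cpp
#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>

using Hresult = std::uint32_t;  // stand-in for the Windows HRESULT type
constexpr Hresult kServerExecFailure = 0x80080005u;  // CO_E_SERVER_EXEC_FAILURE

// Calls `create` up to `max_attempts` times, sleeping `delay` between tries.
// Only CO_E_SERVER_EXEC_FAILURE is retried; any other result (success or a
// different error) is returned immediately. If `max_attempts` is zero or
// negative, `create` is never called and the failure code is returned.
// `create` is a hypothetical callable that would wrap CoCreateInstance().
inline Hresult CreateWithRetry(const std::function<Hresult()>& create,
                               int max_attempts,
                               std::chrono::milliseconds delay) {
    Hresult hr = kServerExecFailure;
    for (int attempt = 1; attempt <= max_attempts; ++attempt) {
        hr = create();
        if (hr != kServerExecFailure) {
            return hr;  // success, or an error that retrying will not fix
        }
        if (attempt < max_attempts) {
            std::this_thread::sleep_for(delay);  // give the server time to start
        }
    }
    return hr;  // still failing: report the definitive failure to the caller
}
```

The total wait is bounded by roughly max_attempts times the delay plus the time each attempt takes, so keeping both numbers small avoids the ten-minute scenario described above.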

Conclusion

Any COM server that takes a while to start might cause this error in CoCreateInstance calls on the client side. You probably won't see this when you deal with a COM server that starts very fast, or with an in-process COM server.

Anyway, this page contains my personal experience with this COM error. Be aware that things might change across Windows versions, so don't take my word on this as an absolute document. Please re-check these facts before deploying any code to production based on these assumed behaviors.

And, most importantly, if you have a serious error in this area, then your best bet is contacting Microsoft PSS - they are very well equipped to diagnose these types of errors in the field.