Every so often, your application makes a call on a COM interface pointer, and the call fails with RPC_E_DISCONNECTED.  Or it fails with RPC_E_SERVER_DIED, or with RPC_E_SERVER_DIED_DNE, or with any number of other RPC errors.  In most cases this means that the server implementing the COM interface is no longer running.  Which error code you receive depends on many factors, such as whether the server process exited normally, whether it shut down unexpectedly, or whether a shutdown was in progress when the call came in.  There are many reasons why the server might be gone.  The process may have simply crashed.  Alternatively, it may have exited because someone stopped its service or killed it from Task Manager, or because a bug caused it to exit early.  If your application uses out-of-proc COM objects, it needs to expect these errors.  There are many ways to handle them, and as always the right choice depends on your scenario.
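Because the exact code depends on timing, code that reacts to a dead server usually wants to test for the whole group at once.  Here is a minimal sketch of such a test.  The name IsServerGoneError is hypothetical, it only covers the three codes named above, and the non-Windows branch exists purely so the snippet compiles without the Windows headers (the stand-in values are the ones from winerror.h).

```cpp
#include <cstdint>

#ifdef _WIN32
#include <windows.h>   // HRESULT, S_OK and the RPC_E_* codes
#else
// Stand-ins so the sketch builds off Windows; values copied from winerror.h.
typedef std::int32_t HRESULT;
const HRESULT S_OK                  = 0;
const HRESULT E_NOINTERFACE         = static_cast<HRESULT>(0x80004002);
const HRESULT RPC_E_SERVER_DIED     = static_cast<HRESULT>(0x80010007);
const HRESULT RPC_E_SERVER_DIED_DNE = static_cast<HRESULT>(0x80010012);
const HRESULT RPC_E_DISCONNECTED    = static_cast<HRESULT>(0x80010108);
#endif

// Hypothetical helper: true when hr is one of the three
// "the server is gone" codes discussed in this post.
inline bool IsServerGoneError(HRESULT hr)
{
    return hr == RPC_E_DISCONNECTED ||
           hr == RPC_E_SERVER_DIED ||
           hr == RPC_E_SERVER_DIED_DNE;
}
```

A real application would probably extend the list as it runs into other RPC errors that mean the same thing for its servers.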

One common pattern is for an application to maintain a list of pointers to COM objects, all of them implementing the same interface, for use throughout the application.  Your application might be an event publisher, pushing event notifications out to anyone who's subscribed.  Alternatively, these COM objects may all be providers of functionality used by your application.  In either case, your application is going to be littered with code that looks like this

list = GetInterfaces()
for each p in list
   HRESULT hr = p->SomeFunc()

   if (FAILED(hr))
      handle error

Now, there's a real problem with the above code.  If any of the COM servers has exited, you'll get one of the above-mentioned RPC errors.  To fix this, we need to check for these errors and handle them appropriately.  Let's say that we want to restart a COM server whenever it exits unexpectedly.  To do this, we need to store a CLSID with each interface pointer, and we also need to handle RPC errors in the above code snippet;  for simplicity, we'll only care about the three RPC errors mentioned above.  Now, instead of storing a list of interface pointers, we'll store a list of the following structure

struct Entry
{
    CLSID m_id;
    IInterfacePointer* m_interface;
    Entry* m_next;
};

Every time we want to broadcast a function call to all entries in the list, we now need code that looks like this

Entry* current = GetInterfaces();
while (current)
{
    // Save the link first: NotifyEntryDead removes and re-adds the entry.
    Entry* next = current->m_next;

    HRESULT hr = current->m_interface->SomeFunc();
    if (hr == RPC_E_DISCONNECTED || hr == RPC_E_SERVER_DIED || hr == RPC_E_SERVER_DIED_DNE)
    {
        NotifyEntryDead(current);
    }
    else if (FAILED(hr))
    {
        // standard error handling
    }

    current = next;
}

Where NotifyEntryDead looks something like this

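void NotifyEntryDead(Entry* entry)
{
    RemoveFromList(entry);

    // Drop the stale proxy before asking COM for a fresh instance.
    entry->m_interface->Release();
    entry->m_interface = NULL;

    HRESULT hr = CoCreateInstance(entry->m_id, NULL, CLSCTX_ALL, IID_IInterfacePointer, (void**)&entry->m_interface);
    if (FAILED(hr))
    {
        // handle error; in this sketch the entry simply stays out of the list
        return;
    }

    AddToList(entry);
}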

Now, writing this loop each time we broadcast a function call creates a lot of tedious, repetitious code.  A slightly more clever programmer would create proxy functions, one for each member of the IInterfacePointer interface, that do this broadcasting for you.  However, if the interface has many members, it's still annoying to have to add this NotifyEntryDead code to each of these broadcast functions.  What's more, if you're fixing existing code to handle this situation, there might not be any broadcasting functions;  in this case, you'll be forced to add this RPC error checking to all of the loops scattered around your code.
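One way to cut down on the repetition, if your compiler supports templates and lambdas (C++11), is to write the loop once and pass in the call to make.  What follows is only a sketch under that assumption.  Broadcast, onDead and onError are hypothetical names, and the non-Windows branch only defines stand-ins so the snippet compiles without the Windows headers.

```cpp
#include <cstdint>

#ifdef _WIN32
#include <windows.h>   // HRESULT, FAILED and the RPC_E_* codes
#else
// Stand-ins so the sketch builds off Windows; values copied from winerror.h.
typedef std::int32_t HRESULT;
const HRESULT S_OK                  = 0;
const HRESULT RPC_E_SERVER_DIED     = static_cast<HRESULT>(0x80010007);
const HRESULT RPC_E_SERVER_DIED_DNE = static_cast<HRESULT>(0x80010012);
const HRESULT RPC_E_DISCONNECTED    = static_cast<HRESULT>(0x80010108);
inline bool FAILED(HRESULT hr) { return hr < 0; }
#endif

// Walks a singly linked list and applies `call` to every entry.
// `onDead` runs for the three "server is gone" codes; `onError` runs for
// any other failure.  EntryT only needs an m_next member, like Entry.
template <typename EntryT, typename Call, typename OnDead, typename OnError>
void Broadcast(EntryT* head, Call call, OnDead onDead, OnError onError)
{
    for (EntryT* current = head; current != nullptr; )
    {
        // Read the link first: onDead may unlink and relink `current`.
        EntryT* next = current->m_next;

        HRESULT hr = call(current);
        if (hr == RPC_E_DISCONNECTED || hr == RPC_E_SERVER_DIED || hr == RPC_E_SERVER_DIED_DNE)
            onDead(current);
        else if (FAILED(hr))
            onError(current, hr);

        current = next;
    }
}
```

With this in place, a broadcast might shrink to something like Broadcast(GetInterfaces(), [](Entry* e) { return e->m_interface->SomeFunc(); }, NotifyEntryDead, LogFailure), where LogFailure is a hypothetical function holding your standard error handling.  It does not help with existing loops that you cannot rewrite, which is the case the rest of this post deals with.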

Now what would be really nice is if the GetInterfaces() function did this checking for you.  Add a new member to IInterfacePointer called IsAlive, and document that it should always return S_OK.  GetInterfaces() will now walk over the list calling IsAlive() on all the pointers.  If IsAlive fails for some interface pointer, GetInterfaces() will call NotifyEntryDead for that entry.  There's still a window where your loops might hit these RPC errors, but the window is now small enough that it's acceptable to handle them as normal errors.

This is great if you have control over this IInterfacePointer interface.  Unfortunately, it's often the case that you can't control what members the interface has.  In order to check whether the interface pointer is valid, you need to be able to call some function on the interface which has no side effects.  And that's when you remember about IUnknown.

Every COM interface derives from a base interface named IUnknown.  IUnknown declares three members that all COM classes must implement:  AddRef, Release, and QueryInterface.  So, our first attempt at checking whether an interface pointer is valid looks as follows

bool IsValid(IUnknown* pointer)
{
    // Note: AddRef and Release return a ULONG reference count, not an HRESULT.
    HRESULT hr = pointer->AddRef();
    if (SUCCEEDED(hr))
        hr = pointer->Release();

    return SUCCEEDED(hr);
}

Unfortunately, it turns out that the above function does not work.  To understand why, you must remember that the interface pointer isn't really a pointer to the out-of-proc COM class.  Rather, it's a pointer to a local proxy class that implements the same interfaces as the out-of-proc class.  This proxy class is responsible for marshalling your function calls over to the process hosting the COM server and calling the real interface pointer in the context of that process.  It turns out that this proxy will only send one AddRef and one Release to the out-of-proc COM class.  Any AddRefs and Releases after the first AddRef are cached locally in the proxy, and a Release will not be sent to the out-of-proc class until the local reference count drops to zero.  Eventually COM will notice that the server is gone and calls on the proxy should start failing, but this might take some time.
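To make the caching concrete, here is a toy model of the behaviour just described.  It is not how COM's proxy is actually implemented, and ToyProxy and RemoteCalls are made-up names;  it only counts how many AddRef and Release calls would leave the process.

```cpp
#include <cassert>

// Toy model only: mimics the reference-count caching described above.
class ToyProxy
{
public:
    unsigned long AddRef()
    {
        if (m_localRefs == 0)
            ++m_remoteCalls;      // only the first AddRef is forwarded
        return ++m_localRefs;
    }

    unsigned long Release()
    {
        assert(m_localRefs > 0);  // releasing an unreferenced proxy is a caller bug
        if (--m_localRefs == 0)
            ++m_remoteCalls;      // only the final Release is forwarded
        return m_localRefs;
    }

    // Number of calls that would have crossed the process boundary.
    int RemoteCalls() const { return m_remoteCalls; }

private:
    unsigned long m_localRefs = 0;
    int           m_remoteCalls = 0;
};
```

In this model, an application that already holds one reference and then runs the AddRef/Release probe takes the local count from 1 to 2 and back to 1, and RemoteCalls() never changes.  The probe never talks to the server, so it has no chance of noticing that the server is gone.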

Since AddRef and Release don't work, let's try using QueryInterface.  Here's our second attempt

bool IsValid(IUnknown* pointer)
{
    IUnknown* unknown = NULL;
    HRESULT hr = pointer->QueryInterface(IID_IUnknown, (void**)&unknown);
    if (SUCCEEDED(hr))
        unknown->Release();

    return SUCCEEDED(hr);
}

QueryInterface for IUnknown must always succeed for any valid COM object, so you would expect it to fail only if the COM object is no longer around.  Unfortunately, this suffers from the same problem that the AddRef solution had.  The COM proxy will cache interface pointers, and if you call QueryInterface with an interface ID that's already cached (and IUnknown will always be cached), QueryInterface succeeds without going beyond the bounds of the current process.  This knowledge leads us to our final attempt.  To guarantee that COM really does try to reach the real COM object, we must call QueryInterface for an interface ID that the proxy has never seen.  The simplest way to do this is to create a new unique identifier each time.  A proper COM object must return E_NOINTERFACE for such a request, so we write the following function

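bool IsValid(IUnknown* pointer)
{
    // A freshly generated IID cannot already be in the proxy's cache.
    IID nonExistent = GUID_NULL;
    HRESULT hr = ::CoCreateGuid(&nonExistent);
    if (FAILED(hr))
    {
        // handle error; this sketch assumes "still valid" since no probe was made
        return true;
    }

    IUnknown* unknown = NULL;
    hr = pointer->QueryInterface(nonExistent, (void**)&unknown);
    assert(FAILED(hr));
    if (SUCCEEDED(hr))
        unknown->Release();   // unexpected, but don't leak the reference

    // E_NOINTERFACE means the server answered; an RPC error means it did not.
    return hr == E_NOINTERFACE;
}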

Our GetInterfaces() function will now walk over the entire list calling IsValid() on each interface pointer.  Any pointers for which IsValid fails will be restarted using NotifyEntryDead.
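The walk itself might look something like the sketch below.  ReviveDeadEntries is a hypothetical name, and the check and the restart are passed in as parameters (standing in for IsValid and NotifyEntryDead) so that the snippet does not depend on the COM headers.  It assumes a C++11 compiler.

```cpp
// Hypothetical sweep for GetInterfaces() to run before handing out the list.
// `isValid` stands in for a call to IsValid on the entry's interface pointer
// and `revive` stands in for NotifyEntryDead.  Returns how many entries
// were found dead and passed to `revive`.
template <typename EntryT, typename IsValidFn, typename ReviveFn>
int ReviveDeadEntries(EntryT* head, IsValidFn isValid, ReviveFn revive)
{
    int revived = 0;
    for (EntryT* current = head; current != nullptr; )
    {
        // Read the link first: revive may unlink and relink `current`.
        EntryT* next = current->m_next;

        if (!isValid(current))
        {
            revive(current);
            ++revived;
        }

        current = next;
    }
    return revived;
}
```

GetInterfaces() could then call ReviveDeadEntries(head, [](Entry* e) { return IsValid(e->m_interface); }, NotifyEntryDead) before returning the list.  Keep in mind that each check is a cross-process call, so sweeping a long list on every GetInterfaces() call has a cost.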