This problem falls into the category of bugs hidden by a macro whose name does not indicate what it touches. If you call IoMarkIrpPending on an IRP that you allocated in your driver, chances are that you are corrupting memory. First, let's look at the implementation of this macro (from wdm.h):

#define IoMarkIrpPending( Irp ) ( \
    IoGetCurrentIrpStackLocation( (Irp) )->Control |= SL_PENDING_RETURNED )

This macro retrieves the current stack location and sets a flag in one of its fields (IO_STACK_LOCATION::Control). A driver-allocated IRP has no current stack location for the driver that allocated it, so the current stack location pointer is not valid. By not valid I mean that the pointer points to undefined memory, not that it is NULL. The call therefore corrupts memory (a nefarious single bit flip) somewhere in the system. It so happens that, based on how IRPs are allocated today, you will corrupt memory immediately after the IRP allocation, which makes this a little bit easier to diagnose later.
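To make "immediately after the IRP allocation" concrete, here is a minimal user-mode model of the layout. It is a sketch, not kernel code: MOCK_IRP, MOCK_STACK_LOCATION, MockAllocationSize, MockAllocateIrp and MockPendingWriteOffset are all hypothetical names, and the structures keep only the fields needed for the illustration. The model assumes that the stack locations sit directly after the IRP header and that a freshly allocated IRP starts with CurrentLocation at StackCount + 1 and its current stack location pointer one element past the last stack location.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified stand-ins for IO_STACK_LOCATION and IRP. */
typedef struct MOCK_STACK_LOCATION {
    unsigned char Control;
} MOCK_STACK_LOCATION;

typedef struct MOCK_IRP {
    signed char StackCount;
    signed char CurrentLocation;
    MOCK_STACK_LOCATION *CurrentStackLocation;
} MOCK_IRP;

/* Total bytes in the model: the header followed by stackCount stack locations. */
size_t MockAllocationSize(signed char stackCount)
{
    return sizeof(MOCK_IRP) + (size_t)stackCount * sizeof(MOCK_STACK_LOCATION);
}

/* Models the assumed initial state of a driver-allocated IRP. */
MOCK_IRP *MockAllocateIrp(signed char stackCount)
{
    assert(stackCount > 0);

    MOCK_IRP *irp = calloc(1, MockAllocationSize(stackCount));
    if (irp == NULL) {
        return NULL;
    }

    irp->StackCount = stackCount;
    irp->CurrentLocation = (signed char)(stackCount + 1);

    /* One element past the last stack location, i.e. the end of the allocation. */
    irp->CurrentStackLocation = (MOCK_STACK_LOCATION *)(irp + 1) + stackCount;
    return irp;
}

/* Byte offset, from the start of the allocation, of the "current" stack location. */
size_t MockPendingWriteOffset(const MOCK_IRP *irp)
{
    return (size_t)((const unsigned char *)irp->CurrentStackLocation -
                    (const unsigned char *)irp);
}
```

In this model, MockPendingWriteOffset(irp) equals MockAllocationSize(stackCount) for a freshly allocated IRP. In other words, a write through the current stack location pointer would start at the first byte that does not belong to the allocation.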

How would a mistaken call to IoMarkIrpPending even get into your driver? Well, most code is copied from somewhere else, and a lot of completion routines contain the code shown below. These completion routines were written to manipulate an I/O manager presented IRP (which works because there is a current stack location), but once copied into a completion routine for a driver-allocated IRP they are a time bomb waiting to go off. Here is an example of such a copied completion routine:

NTSTATUS CompletionRoutine(PDEVICE_OBJECT DeviceObject, PIRP Irp, PVOID Context)
{
    ...
    if (Irp->PendingReturned) {
        // Wrong for a driver-allocated IRP: there is no valid current stack location
        IoMarkIrpPending(Irp);
    }
    ...
    IoFreeIrp(Irp);

    //
    // Must return STATUS_MORE_PROCESSING_REQUIRED because we cannot let the IRP complete
    // back to the I/O manager
    //
    return STATUS_MORE_PROCESSING_REQUIRED;
}

What happens if you have an I/O completion routine that handles both types of IRPs (driver created and I/O manager presented)? In the I/O manager presented case you need to propagate the pending state by calling IoMarkIrpPending, while in the driver created case you obviously should not. KMDF must handle both since it sets the same completion routine for all requests sent to a WDFIOTARGET. Here is how KMDF manages this: if Irp->PendingReturned is TRUE and the current stack location index is less than or equal to the total stack count (in the driver-allocated case, CurrentLocation will be > StackCount), propagate the pending state into the current stack location.

VOID
PropagatePendingReturned(
    PIRP Irp
    )
{
   if (Irp->PendingReturned && Irp->CurrentLocation <= Irp->StackCount) {
       IoMarkIrpPending(Irp);
   }
}
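The effect of that check on the two kinds of IRP can be tried out in a small user-mode model. Again this is only a sketch under stated assumptions: MOCK_IRP_STATE, MOCK_MAX_STACK, MOCK_SL_PENDING_RETURNED and MockPropagatePendingReturned are hypothetical names, the stack locations are reduced to an array of Control bytes, and stack location n is modeled as Control[n - 1].

```c
#include <assert.h>
#include <stdbool.h>

#define MOCK_MAX_STACK 8
#define MOCK_SL_PENDING_RETURNED 0x01   /* stand-in for SL_PENDING_RETURNED */

/* Hypothetical reduced IRP: only the fields the check looks at. */
typedef struct MOCK_IRP_STATE {
    bool PendingReturned;
    signed char StackCount;
    signed char CurrentLocation;            /* 1-based; StackCount + 1 means no current location */
    unsigned char Control[MOCK_MAX_STACK];  /* Control[n - 1] models stack location n */
} MOCK_IRP_STATE;

/*
 * Same condition as PropagatePendingReturned above. Returns true when the
 * pending bit was written, false when the write was skipped.
 */
bool MockPropagatePendingReturned(MOCK_IRP_STATE *irp)
{
    assert(irp->StackCount >= 1 && irp->StackCount <= MOCK_MAX_STACK);
    assert(irp->CurrentLocation >= 1);

    if (irp->PendingReturned && irp->CurrentLocation <= irp->StackCount) {
        irp->Control[irp->CurrentLocation - 1] |= MOCK_SL_PENDING_RETURNED;
        return true;
    }
    return false;
}
```

With StackCount set to 3, a model IRP whose CurrentLocation is 2 (the I/O manager presented case) gets the bit set in Control[1], while one whose CurrentLocation is 4 (the driver-allocated case) is left untouched.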

BTW, unaccredited part 1 is why you don't have a DeviceObject in your I/O completion routine.