Ever seen this run-time failure? - "The value of ESP was not properly saved across a function call. This is usually a result of calling a function declared with one calling convention with a function pointer declared with a different calling convention"

Whenever I get sloppy, I tend to write awful code as shown below -

FARPROC pDllFunc = NULL;
// No calling convention specified: under /Gd this pointer defaults to __cdecl
HRESULT (* pDllGetReqRuntimeVer) (LPWSTR, LPWSTR, DWORD, DWORD*);
HMODULE hModuleMscoree = LoadLibrary(L"mscoree.dll");

if(NULL != hModuleMscoree)
{
    pDllFunc = GetProcAddress(hModuleMscoree, "GetRequestedRuntimeVersion");
    if(NULL != pDllFunc)
    {
        // The cast repeats the same mistake: no __stdcall here either
        pDllGetReqRuntimeVer = (HRESULT(*)(LPWSTR, LPWSTR, DWORD, DWORD*))pDllFunc;
        hr = pDllGetReqRuntimeVer(
                    thrParam->cmdLineSw.pwszImage,
                    wszDebuggeeVer,
                    sizeof(wszDebuggeeVer)/sizeof(wszDebuggeeVer[0]),
                    & dws);
        :
        :
        :
    }
}

When sloppy, I forget to specify the correct calling convention when declaring function pointers. Then I run into the above run-time error when calling functions exported by DLLs (mscoree!GetRequestedRuntimeVersion in the above example). Most Win32 APIs are declared as __stdcall, which expects the "callee" to clean up the stack. I compile with /Gd (which uses __cdecl as the default convention). Because I forgot to mark the function pointer with __stdcall, the compiler generates a call site in which the "caller" cleans up the stack as well, so the stack pointer ends up adjusted twice. Fortunately, I'm bailed out by the above run-time check. To correct this, the function pointer must be declared and used as follows -

FARPROC pDllFunc = NULL;
// __stdcall matches the convention of the exported function
HRESULT (__stdcall * pDllGetReqRuntimeVer) (LPWSTR, LPWSTR, DWORD, DWORD*);
HMODULE hModuleMscoree = LoadLibrary(L"mscoree.dll");

if(NULL != hModuleMscoree)
{
    pDllFunc = GetProcAddress(hModuleMscoree, "GetRequestedRuntimeVersion");
    if(NULL != pDllFunc)
    {
        // The cast must carry the same calling convention as the declaration
        pDllGetReqRuntimeVer = (HRESULT(__stdcall *)(LPWSTR, LPWSTR, DWORD, DWORD*))pDllFunc;
        hr = pDllGetReqRuntimeVer(
                    thrParam->cmdLineSw.pwszImage,
                    wszDebuggeeVer,
                    sizeof(wszDebuggeeVer)/sizeof(wszDebuggeeVer[0]),
                    & dws);
        :
        :
        :
    }
}
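In the snippets above the pointer signature is spelled out twice, once in the declaration and once in the cast, so the calling convention can be dropped in either place. One way to reduce that risk is to name the pointer type once with a typedef that carries the convention. The following is only a sketch under stated assumptions: DEMO_CALLCONV, PFN_GET_VERSION, FakeGetVersion, FakeGetProcAddress and CallThroughTypedef are hypothetical names, and the fake functions stand in for mscoree.dll and GetProcAddress so that the sketch compiles without the Windows headers. In real Win32 code the typedef would presumably use WINAPI (or __stdcall directly) and the cast would be applied to the FARPROC returned by GetProcAddress.

```cpp
// Hypothetical macro: __stdcall on 32-bit MSVC builds, empty elsewhere so the
// sketch still compiles on other compilers and targets.
#if defined(_MSC_VER) && defined(_M_IX86)
#define DEMO_CALLCONV __stdcall
#else
#define DEMO_CALLCONV
#endif

// The calling convention is written exactly once, inside the typedef.
typedef long (DEMO_CALLCONV *PFN_GET_VERSION)(const wchar_t*, wchar_t*,
                                              unsigned long, unsigned long*);

// Hypothetical stand-in for an exported __stdcall function. It copies a
// made-up version string into the caller's buffer.
long DEMO_CALLCONV FakeGetVersion(const wchar_t* image, wchar_t* buffer,
                                  unsigned long cch, unsigned long* written)
{
    static const wchar_t kVersion[] = L"v0.0";  // made-up value
    const unsigned long needed = sizeof(kVersion) / sizeof(kVersion[0]);
    if (image == nullptr || buffer == nullptr || written == nullptr || cch < needed)
        return -1;
    for (unsigned long i = 0; i < needed; ++i)
        buffer[i] = kVersion[i];
    *written = needed;  // character count, including the terminator
    return 0;
}

// Hypothetical stand-in for GetProcAddress: hands back an untyped pointer,
// just as FARPROC carries no information about the real signature.
typedef void (*UntypedProc)();
UntypedProc FakeGetProcAddress()
{
    return reinterpret_cast<UntypedProc>(&FakeGetVersion);
}

// Declaration and cast both use the typedef, so they cannot disagree.
long CallThroughTypedef(wchar_t* buffer, unsigned long cch, unsigned long* written)
{
    PFN_GET_VERSION pfn = reinterpret_cast<PFN_GET_VERSION>(FakeGetProcAddress());
    if (pfn == nullptr)
        return -1;
    return pfn(L"app.exe", buffer, cch, written);
}
```

The typedef does not check that the convention is the right one for a given export. It only guarantees that the declaration and the cast agree with each other.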

 

Given below is the native disassembly of the call site (argument pushes, the call, and the code after it) when the function pointer is declared as __cdecl and as __stdcall respectively. As you can see, with __cdecl the "caller" cleans up the stack after the call returns, whereas with __stdcall the "callee" has already done so by the time control comes back.

When using __cdecl 

00412DB1  mov         esi,esp
00412DB3  lea         eax,[dws]
00412DB9  push        eax
00412DBA  push        80h
00412DBF  lea         ecx,[ebp-168h]
00412DC5  push        ecx
00412DC6  mov         edx,dword ptr [ebp-48h]
00412DC9  mov         eax,dword ptr [edx+10h]
00412DCC  push        eax
00412DCD  call        dword ptr [ebp-61Ch]
00412DD3  add         esp,10h  <----------- stack is cleaned by caller!
00412DD6  cmp         esi,esp
00412DD8  call        @ILT+840(__RTC_CheckEsp) (41134Dh) <------ The run time check
00412DDD  mov         dword ptr [ebp-18h],eax

 

When using __stdcall 

00412DB1  mov         esi,esp
00412DB3  lea         eax,[dws]
00412DB9  push        eax
00412DBA  push        80h
00412DBF  lea         ecx,[ebp-168h]
00412DC5  push        ecx
00412DC6  mov         edx,dword ptr [ebp-48h]
00412DC9  mov         eax,dword ptr [edx+10h]
00412DCC  push        eax
00412DCD  call        dword ptr [ebp-61Ch]
00412DD3  cmp         esi,esp <--------- At this stage callee has already cleaned up stack
00412DD5  call        @ILT+840(__RTC_CheckEsp) (41134Dh) <------ The run time check
00412DDA  mov         dword ptr [ebp-18h],eax
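The arithmetic behind the two listings can be sketched in a few lines. Four arguments at 4 bytes each account for the 10h in add esp,10h, and a __stdcall callee removes the same 16 bytes itself when it returns. If both sides clean up, ESP ends up 16 bytes above the value saved in ESI, and that is what the cmp esi,esp check catches. The helper names below (kSlotBytes, StackBytesForArgs, EspDriftAtCheck) are hypothetical and only model the bookkeeping, assuming a 32-bit x86 target. They do not touch the real stack.

```cpp
// Assumption: 32-bit x86, where each of these arguments takes one 4-byte slot.
const long kSlotBytes = 4;

// Bytes of stack occupied by the pushed arguments.
long StackBytesForArgs(long argCount)
{
    return argCount * kSlotBytes;
}

// Models ESP relative to the value saved in ESI before the pushes.
// Returns ESP - ESI at the point of "cmp esi,esp"; 0 means the check passes.
// The return address pushed by "call" and popped by "ret" cancels out, so it
// is left out of the model.
long EspDriftAtCheck(bool callerCleans, bool calleeCleans, long argBytes)
{
    long esp = 0;
    esp -= argBytes;          // the push instructions move ESP down
    if (calleeCleans)
        esp += argBytes;      // __stdcall callee pops its own arguments on return
    if (callerCleans)
        esp += argBytes;      // __cdecl call site: add esp,N after the call
    return esp;
}
```

With these definitions, StackBytesForArgs(4) is 16 (10h). EspDriftAtCheck(true, false, 16) and EspDriftAtCheck(false, true, 16) are both 0, which corresponds to the two matching cases in the listings. EspDriftAtCheck(true, true, 16) is 16, which corresponds to calling a __stdcall function through a __cdecl pointer.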

 

I should be paying more attention to calling conventions instead of relying on this run-time check. But it is nice to have this safety net.