Larry Osterman's WebLog

Confessions of an Old Fogey
Blog - Title

What's wrong with this code, part 4: the answers

What's wrong with this code, part 4: the answers

  • Comments 16

As I mentioned yesterday, this is a subtle problem.  Apparently it wasn’t subtle enough for the people commenting on the API, without fail, everyone nailed it perfectly.

But this IS a problem that I run into at least once a month.  Someone comes to my office and says “I’m getting an undeclared external in my application and I don’t see why!”  95% of the time, the problem is exactly the one in this bug.

Here’s the offending line in the buggy header:

NIFTY_API_API int MyNiftyAPI(void);

Not really surprising, heck, it’s the ONLY line in the buggy header.  What’s wrong with the line is actually an error of omission.  The header doesn’t specify the calling convention used for the API.  As a result, in the absence of an explicit calling convention, the compiler assumes that the current calling convention is the calling convention for the API.

Unfortunately that’s not the case.  The calling convention for an API is set by the compiler when the code is built, if every part of the project uses the same calling convention, you’re fine, but if anyone compiles their code with a calling convention other than yours, you’re toast.  Raymond goes into some detail on how to diagnose these problems here, so…  As I mentioned yesterday, he’s written a number of posts on the subject.

The key indicator that there might be a problem was my statement “I’m writing a DLL”.  If this was just a routine in my application, it wouldn’t matter, since all the components in my application are usually compiled with the same set of compiler settings.

But when you’re dealing with DLL’s (or statically linked libraries), the consumer of your code typically isn’t in the same project as you are, so it’s absolutely critical that you specify the calling convention you used to prevent them from using the “wrong” calling convention in their code.

Kudos:

Grant pointed out the problem initially, followed quickly (and more authoritatively) by Borsis Yanushpolsky.  Everyone else posting comments also picked it up.

Grant also pointed out (in private email), and Mike Dimmick pointed out in the comments section that there’s another, equally glaring problem with the header.  It is missing an extern “C” to correctly inform the compiler that the API in question shouldn’t use C++ name decoration.  The code should have been wrapped with:

#ifdef __cplusplus
extern "C" {            /* Assume C declarations for C++ */
#endif  /* __cplusplus */ 

#ifdef __cplusplus
}                       /* End of extern "C" { */
#endif  /* __cplusplus */
 

 So the full version of the header should be:

// The following ifdef block is the standard way of creating macros which make exporting
// from a DLL simpler. All files within this DLL are compiled with the NIFTY_API_EXPORTS
// symbol defined on the command line. this symbol should not be defined on any project
// that uses this DLL. This way any other project whose source files include this file see
// NIFTY_API_API functions as being imported from a DLL, whereas this DLL sees symbols
// defined with this macro as being exported.
#ifdef NIFTY_API_EXPORTS
#define NIFTY_API_API __declspec(dllexport)  
#else
#define NIFTY_API_API __declspec(dllimport)
#endif

#if defined(_STDCALL_SUPPORTED)
#define STDCALL __stdcall    // Declare our calling convention.
#else
#define STDCALL
#endif // STDCALL_SUPPORTED

#ifdef __cplusplus
extern "C" {     // Don’t use C++ decorations for our API names
#endif

NIFTY_API_API int STDCALL MyNiftyAPI(void);

#ifdef __cplusplus
}                // Close the extern C.
#endif

I have a huge preference for __stdcall APIs.  They have all of the benefits of the __cdecl calling convention (except for the ability to support variable numbers of arguments) and they result in significantly smaller code (since the routine cleans its stack, not the caller).   As I mentioned in a comment in the previous post, the savings that NT achieved when it switched its default calling convention from __cdecl to __stdcall was huge – far more than we had anticipated.

There’s still one more potential bug: The header file isn’t wrapped in either a #pragma once, or in #ifdef _NIFTY_API_INCLUDED_/#define _NIFTY_API_INCLUDED_/#endif // _NIFTY_API_INCLUDED_.  Given the current API header this isn’t a problem, but if the header file is included from multiple places, the possibility exists that definitions within the header could result in multiple definitions, which could later result in errors.

McGroarty brought up an interesting point:

I'm no Windows guy, but I'll put a cautious eye to the generic int being subject to the signedness and size of the day.

I hadn’t considered this originally, but he has a point.  Int is the only fundamental C/C++ type that has a variable size, it is guaranteed to be at least as large as a short (which is larger or equal to a char, but guaranteed to be able to hold values from -32767 to 32767).  So an int can be either a 16 bit, 32 bit or 64 bit quantity.

 

  • Don't forget #pragma once or its equivalents...
  • Wasn't that my "there's still one more potential bug:" comment?
  • Question about the data type size, I always understood the requirements as:
    sizeof(char) == 1
    sizeof(short) <= sizeof(int) <= sizeof(long)

    I wasn't aware that sizeof(short) had to be >= 2.
  • I actually checked before I wrote that :) According to: http://www.open-std.org/jtc1/sc22/wg14/www/docs/n869/n869.txt.gz
    in section 5.2.4.2.1:
    -- minimum value for an object of type short int
    SHRT_MIN -32767 // -(215-1)

    -- maximum value for an object of type short int
    SHRT_MAX +32767 // 215-1

  • Note that in C standard terms, sizeof(short) does _not_ have to be >= 2, because char may very well be larger than 8 bits. E.g. both could be 16 bits and sizeof(short) would be 1. That's why they're defined in terms of minimum values rather than relative sizes.

    In practical terms I doubt you'd see that on Windows, unless some of the Windows Embedded platforms have such an environment.
  • Windows XP Embedded uses the same binaries as Windows XP Professional, you just get to choose which ones are part of your platform. This means that, IIRC, regular Windows hotfixes also work on an XP Embedded system.

    Windows CE compiles some components from source and links others from supplied static libraries to build the final binaries and ROM image. It has little code shared with Windows XP or 9x, and desktop binaries won't run directly (except some CLR binaries which conform to pretty strict requirements). Nevertheless, to improve source-level portability, the rules for integer sizes are exactly the same: a char is 8 bits, a short 16 bits, and int and long are both 32 bits. __int64 is also supported. These rules are identical across all supported architectures (ARM, MIPS, Hitachi SHx, x86, and PowerPC - PPC is not supported after CE 3.0).

    This source-level compatibility also allows Microsoft to maintain one 32-bit x86 compiler, rather than two. eMbedded Visual C++ 4.0 SP3 ships with CL.EXE version 12.00.8804, the same version as Visual Studio 6 SP5.
  • Using _NIFTY_API_INCLUDED_ is AFAIK not correct in a user-defined library. Identifiers beginning with an underscore and an uppercase character are reserved for implementation by ANSI C. See e.g. http://www.oakroadsystems.com/tech/c-predef.htm#Groups
  • Why not use a .def file in the implementation dll to export MyNiftyAPI and avoid the "__declspec" mess? With a .def file you can control a few other things, like the ordinal.
  • noone: Because I wanted a self contained example.

    Also, the __declspec "mess" as you put it allows the compiler to generate more efficient code - instead of generating:

    call MyNiftyAPI_thunk


    MyNiftyAPI_thunk:
    jmp [__imp_MyNiftyAPI]

    the compiler can generate:

    call [__imp_MyNiftyAPI]

    which can result in a significant perf benefit (seriously).

    Also, ordinals have other issues, they work, but only in very limited scenarios.

  • Larry - thanks for the response. Great info.

    Do you prefer using __declspec over .def files in your code?

    Could you point me to some info on issues with ordinals?

    Thanks again!
  • I do prefer __declspec, but I use .def files as well. For C++ code, __declspec is effectively required (since the function decoration gets in the way).

    The good thing about ordinals is that they resolve faster than without ordinals - that reduces DLL snap time.

    The problem with ordinals occurs after you ship V1. Once you ship a DLL with its entrypoints mapped via ordinals you can never ever ever change the ordinals for your APIs. Even if you decide to remove an API (for whatever reason) you still need to keep the entrypoint to keep all the other ordinals the same.
  • Well, since your function takes no parameter, in practice it doesn't matter what calling conventions the caller and the DLL are using. So I guess the real bug is using a bad example. :)

    Unless you have a weird calling convention that specifies something like the callee leaves the return address on stack after returning, or passing return address in registers.
  • B.Y. That's a good point, but there are STILL differences between the calling conventions

    For example, in __cdecl, the name of the function as exported by the linker is _MyNiftyAPI. In __stdcall, it is MyNiftyAPI@0.

    If you're app is looking for _MyNiftyAPI, and the library contains MyNiftyAPI@0, you'll get an undocumented external (remember my comment at the beginning about people coming to my office asking why their code doesn't link).


  • You're right, I was only thinking of parameter passing. I tend to think name mangling as naming convention.

  • Semi-retired C language lawyer pops in now.

    In the 1989 C standard and as far as I can tell from the relevant sections of the 1999 C standard, there are a lot less requirements on sizeof the various integer types than everyone here seems to think. This was a famous bug in the 1989 standard, but it and other famous bugs in the 1989 standard were retained unchanged in the 1999 standard.

    There are minimim requirements of the form that short must be able to represent at least the range from -32767 to +32767, int must also do so, long's minimum requirement is somewhat larger, etc. There are requirements that every value representable in a short must also be representable in an int, every value that is representable in an int must also be representable in a long, etc.

    But except for the type unsigned char, all of the other integral types are allowed to have unused bits in their representations.

    For example, sizeof(short) can be 32 bytes, though the implementation might only use 48 of the bits to represent the value, with the other 208 bits unused.

    And sizeof(int) can be 16 bytes, using 52 bits to represent the value, with the other 76 bits unused. (Or alternatively using 48 bits to represent the value, same as for short.)

    And sizeof(long) can be 8 bytes, using 53 bits to represent the value (or alternatively at least as many as for int, and at least 32).

    Of course no implementor is as insane as this, but the standard permits this insanity and any programmer who cares about portability has to have a bit of awareness about it.

    int is always signed so that isn't an issue. Plain char might have the same representation as either signed char or unsigned char but in the language it's a different type from both. With plain char unlike int you do have to be aware that the range of values might or might not include negative values. sizeof(char) and sizeof(unsigned char) and sizeof(signed char) are always 1, but sizes of other types are up in the air.

    Even practically speaking, and even in days when memory was expensive (1 megabyte of core being the same size as a refrigerator), there existed machines where it would be sensible for sizeof(char), sizeof(short), sizeof(int), and sizeof(long) to all be 1. Usually implementors went to some effort to store multiple characters in each addressable word, but they didn't have to. Using all 36 bits for a char, same as for a long, was wasteful but not necessarily insane.
Page 1 of 2 (16 items) 12