Holy cow, I wrote a book!
The MB_ERR_INVALID_CHARS flag controls how the MultiByteToWideChar function treats invalid characters. Some people claim that the following sentences in the documentation are contradictory:
MB_ERR_INVALID_CHARS
MultiByteToWideChar
Actually, the three sentences are talking about different cases. The first two talk about what happens if you omit the flag; the third talks about what happens if you include the flag.
Since people seem to like tables, here's a description of the MB_ERR_INVALID_CHARS flag in tabular form:
Here's a sample program that illustrates the possibilities:
#include <windows.h> #include <ole2.h> #include <windowsx.h> #include <commctrl.h> #include <strsafe.h> #include <uxtheme.h> void MB2WCTest(DWORD flags) { WCHAR szOut[256]; int cch = MultiByteToWideChar(CP_UTF8, flags, "\xC0\x41\x42", 3, szOut, 256); printf("Called with flags %d\n", flags); printf("Return value is %d\n", cch); for (int i = 0; i < cch; i++) { printf("value[%d] = %d\n", i, szOut[i]); } printf("-----\n"); } int __cdecl main(int argc, char **argv) { MB2WCTest(0); MB2WCTest(MB_ERR_INVALID_CHARS); return 0; }
If you run this on Windows XP, you get
Called with flags 0 Return value is 2 Value[0] = 65 Value[1] = 66 ----- Called with flags 8 Return value is 0 -----
This demonstrates that passing the MB_ERR_INVALID_CHARS flag causes the function to fail, and omitting it causes the invalid character \xC0 to be dropped.
If you run this on Windows Vista, you get
Called with flags 0 Return value is 3 Value[0] = 65533 Value[1] = 65 Value[2] = 66 ----- Called with flags 8 Return value is 0 -----
This demonstrates again that passing the MB_ERR_INVALID_CHARS flag causes the function to fail, but this time, if you omit the flag, the invalid character \xC0 is converted to U+FFFD, which is REPLACEMENT CHARACTER. (Note that it does not appear to be documented precisely what happens to invalid characters, aside from the fact that they are not dropped. Perhaps code pages other than CP_UTF8 convert them to some other default character.)
CP_UTF8