The new compiler error C4819

Sorting it all Out
Michael Kaplan's random stuff of dubious value


I was looking at Elyasse's Weblog and was reminded of one of the coolest feature entries in Whidbey.

I think I have been waiting roughly 112 versions of the Microsoft compilers for this. Well, probably not that many but it does feel like that....

New in Whidbey! From the help:

C4819 occurs when an ANSI source file is compiled on a system with a codepage that cannot represent all characters in the file.

To resolve C4819, save the file in Unicode format.
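
As a rough illustration of what "cannot represent all characters" can look like at the byte level, here is a minimal sketch in C. It assumes the system code page is 932 (Japanese Windows) and only checks the lead-byte/trail-byte structure of that code page. The function name `cp932_well_formed` is hypothetical, and this is not a description of what the compiler does internally.

```c
#include <stddef.h>

/* Lead bytes of double-byte characters in code page 932. */
static int is_lead(unsigned char b) {
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Bytes allowed as the second byte of a double-byte character. */
static int is_trail(unsigned char b) {
    return (b >= 0x40 && b <= 0x7E) || (b >= 0x80 && b <= 0xFC);
}

/* Returns 1 if every lead byte is followed by a valid trail byte,
   0 otherwise. This is a structural check only (an assumption made
   for illustration); it does not consult the real mapping table. */
int cp932_well_formed(const unsigned char *s, size_t n) {
    size_t i = 0;
    while (i < n) {
        if (is_lead(s[i])) {
            if (i + 1 >= n || !is_trail(s[i + 1])) {
                return 0;   /* dangling or invalid double-byte sequence */
            }
            i += 2;         /* consume lead + trail */
        } else {
            i += 1;         /* single-byte character */
        }
    }
    return 1;
}
```

Under this check, the bytes `22 E3 81 82 22` (a quoted UTF-8 "あ") fail because 0x82 is read as a lead byte and swallows the closing quote, while `22 C3 A9 22` (a quoted UTF-8 "é") happens to pass, since both bytes are single-byte characters in code page 932.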

This is incredibly cool.... :-)

 

This post is sponsored by "©" (a.k.a. U+00A9, COPYRIGHT SIGN)

Blog - Comment List
  • "C4819 occurs when a non-ANSI compliant compiler ignores clause 2.1 (Phases of Translation) in the C++ standard and does not map physical source characters to the basic source character set"
  • Answer to the question this post implied....
  • Ah, it generates this when it can't perform the mapping because it has encountered a nonsense byte? That's fine, but I still think "save the file as unicode" is duff advice considering all the legacy source control systems out there...

    I'm now curious. What did the compiler do prior to Whidbey? Assume a particular codepage?
  • Well, the advice is for *new* code, not legacy.

    Legacy systems would always save the file as ANSI and so the characters in question would be converted to question marks (same as in notepad when you try to save as ANSI).
  • Too much i18n does not seem good for the compiler.

    Michael, with all respect, I cannot share your view of this "incredibly cool" feature. I think it is incredibly uncool.

    The bad thing about this warning, which can turn into an error as in

    http://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=341454

    is this: C strings, the null-terminated arrays of bytes, do not carry any encoding information per se, i.e. they are supposed to be treated as opaque arrays of bytes. Now, I have a perfectly valid C file that is ASCII-only except for UTF-8 bytes in strings (UTF-8 for a good reason: I intend to edit this file in a UTF-8 editor). Such a file will now break with an incomprehensible message on Whidbey on Japanese Windows.

    The Connect bug is now resolved as Won't Fix, so I cannot even hope that this will be fixed in the next version of the compiler.

    Alternatives for me?

    1) Documentation and support say: add a BOM to the file. No way; then it will break on older compilers and on non-Microsoft compilers.

    2) #pragma setlocale? Does not work.

    3) Convert the strings to their hex byte-array form, something like

    char foo[] = {0xba,0xad,0xf0,0x0d,0x00}?

    That will work, but it will look ugly, and I'll have to forget about editing this file in my wonderful UTF-8-capable editor, the VS2005 IDE.

    Or forget about getting this file compiled on Japanese Windows. It is not important *for me* anyway. This compiler works quite well in Latin-1 territories :)
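
    For what it is worth, alternative 3 in the comment above can be sketched in C roughly as follows, keeping the source file pure ASCII so the system code page no longer matters. The variable and function names are made up for illustration, and the example character "é" (UTF-8 bytes 0xC3 0xA9) is an assumption, not something from the commenter's file.

```c
#include <string.h>

/* "é" (U+00E9) as an explicit byte array; the casts avoid
   narrowing complaints where plain char is signed. */
const char e_acute_array[] = { (char)0xC3, (char)0xA9, 0x00 };

/* The same bytes written with hex escapes inside a string literal. */
const char e_acute_escaped[] = "\xC3\xA9";

/* "édition": the literal is split so that the 'd' is not read as
   one more hex digit of the \xA9 escape. */
const char edition[] = "\xC3\xA9" "dition";

/* Returns 1 if the array form and the escaped form hold identical bytes. */
int same_bytes(void) {
    return sizeof e_acute_array == sizeof e_acute_escaped &&
           memcmp(e_acute_array, e_acute_escaped, sizeof e_acute_array) == 0;
}
```

    Hex escapes are usually a little less ugly than the array form, but neither helps with the commenter's real complaint: the text is no longer readable or editable as UTF-8 in the editor.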

  • Since the BOM does exist, you could also petition the other compilers to start recognizing it, too. I'm sorry, but I agree with Jonathan Caves on this issue -- use the BOM and you are golden.

  • Sometimes things work by accident. You know -- no one ever planned for it to work, no one tested it to
