<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx</link><description>Occasionally, somebody fails to pass.</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541295</link><pubDate>Wed, 01 Mar 2006 18:29:01 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541295</guid><dc:creator>Moi</dc:creator><description>He should call SHGetSpecialFolderPath with CSIDL_APPDATA as the folder parameter. Why? Because in different language installations the directory is called something different. Even when you think &amp;quot;ah, it will only be used in Spain, inside this very company&amp;quot; there will be someone who has their computer installed with English language Windows or something and your program may well not work correctly.</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541301</link><pubDate>Wed, 01 Mar 2006 18:39:44 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541301</guid><dc:creator>Serge Wautier</dc:creator><description>The conversion succeeded: oacute (U+00F3) translates to 0xA2 in code page 850 (OEM Latin 1).&lt;br&gt;&lt;br&gt;bonus 1: The problem is that he uses the wrong code page : CP_OEMCP instead of CP_ACP.
&lt;br&gt;&lt;br&gt;bonus 2: When he passes the string to a Windows ANSI API, Windows converts back to Unicode using the current ANSI codepage (default = 1252 on a Spanish box) -&amp;gt; 0xA2 in codepage 1252 = cent character, as your display shows.&lt;br&gt;&lt;br&gt;If he wants to pass the string to a Windows API, he should use CP_ACP, which will convert to code page</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541306</link><pubDate>Wed, 01 Mar 2006 18:44:45 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541306</guid><dc:creator>Jonathan Perret</dc:creator><description>The trick is in the phrase &amp;quot;it returns the multibyte string below&amp;quot; : it does not make sense to talk about a &amp;quot;multibyte string&amp;quot; without mentioning in what codepage it is encoded.&lt;br&gt;&lt;br&gt;What your customer probably meant to say &amp;quot;when I pass the output of WideCharToMultiByte to MessageBoxA(), here's what I see on my screen&amp;quot;.&lt;br&gt;The fact that it does not match what they see when calling MessageBoxW() with the original Unicode string is caused by the fact that MessageBoxA expects ANSI-encoded text, but this particular call to WideCharToMultiByte produces OEM-encoded text.&lt;br&gt;&lt;br&gt;So to answer the questions :&lt;br&gt;(0) pbUsedDefault is not set because the conversion went fine&lt;br&gt;(1) they should not change their conversion code, it what they want really is an OEM string !
&lt;br&gt;(2) they got confused because they did not keep their eye on the code page :-)&lt;br&gt;</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541318</link><pubDate>Wed, 01 Mar 2006 19:02:37 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541318</guid><dc:creator>Michael Jung</dc:creator><description>Here's my theory:&lt;br&gt;&lt;br&gt;Windows multibyte APIs (the ones with an 'A' suffix) assume the system's ANSI CP (CP_ACP) to be used for strings. The given code converts the unicode encoded path to CP_OEMCP, the system's OEM codepage. OEM codepages are those used by DOS and with the FAT filesystem (I guess NTFS is unicode?).&lt;br&gt;&lt;br&gt;(0) pbUsedDefault is not set, because there is an &amp;quot;'o&amp;quot; (sorry, german keyboard. nodeadkeys) character in the OEM codepage and it was correctly converted to it. However, I guess you are using an ANSI API to output the converted string.&lt;br&gt;(1) The customer should pass CP_ACP to WideCharToMultiByte,&lt;br&gt;(2) because then he'll get an ANSI string, which is the correct encoding for Win32 multibyte APIs.</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541320</link><pubDate>Wed, 01 Mar 2006 19:07:44 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541320</guid><dc:creator>Nick Lamb</dc:creator><description>Answer: pbUsedDefault isn't set because no default characters were used, every character in the input string was available in the target encoding.&lt;br&gt;&lt;br&gt;1a. They /should/ switch to writing Unicode software, use UTF-8 and probably abandon Win32.&lt;br&gt;&lt;br&gt;1b. However they're more likely to change CP_OEMCP to CP_ACP and recompile.&lt;br&gt;&lt;br&gt;2a. It's 2006 already. 
Who wants to still debug code page problems a decade after they became irrelevant?&lt;br&gt;&lt;br&gt;2b. They've told the hopelessly overloaded WideCharToMultiByte function to convert to an OEM character encoding, in this case probably OEM 850 or OEM 858. But their actual display encoding is probably ANSI 1252, so the resulting string is &amp;quot;correct&amp;quot; but it's useless and appears wrong. CP_ACP tells WideCharToMultiByte to use the ANSI codepage. This part of the function is not too buggy so it will probably work.</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541326</link><pubDate>Wed, 01 Mar 2006 19:19:08 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541326</guid><dc:creator>Eber Irigoyen</dc:creator><description>&amp;quot;D:\Documents and Settings\ABC\Configuraci&amp;#243;n local\&amp;quot;&lt;br&gt;&lt;br&gt;looks more like a Spanglish installation =oP</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541327</link><pubDate>Wed, 01 Mar 2006 19:20:04 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541327</guid><dc:creator>Centaur</dc:creator><description>I hope that monitor instruction doesn’t have a section in Russian…</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541339</link><pubDate>Wed, 01 Mar 2006 19:33:10 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541339</guid><dc:creator>Serge Wautier</dc:creator><description>[I don't want to look like I'm feeding a troll. But...]&lt;br&gt;&lt;br&gt;Nick,&lt;br&gt;&lt;br&gt;2a: If nobody wants to debug such problems, why did you debug this one ? 
;-)&lt;br&gt;Now, some people don't really have a choice. Believe it or not, not everyone writes software that talks to Windows only. I've been told that even in 2006, there are still quite a few devices out there which are not Unicode aware (and will not be in the next decade). And there are people who write Windows software who need to talk to such devices (I'm in that crowd). And if you ask them, they will tell you: OEM code page issues are anything but irrelevant.</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541341</link><pubDate>Wed, 01 Mar 2006 19:34:58 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541341</guid><dc:creator>Michael S. Kaplan</dc:creator><description>It looks like the flag(s) to actually do something with those default character parameters is not passed? :-)</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541359</link><pubDate>Wed, 01 Mar 2006 19:58:06 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541359</guid><dc:creator>Serge Wautier</dc:creator><description>Michael,&lt;br&gt;&lt;br&gt;The docs are not very clear about this: &lt;br&gt;&lt;br&gt;&amp;lt;quote&amp;gt;&lt;br&gt;lpUsedDefaultChar:&lt;br&gt;Points to a flag that indicates whether a default character was used. The flag is set to TRUE if one or more wide characters in the source string cannot be represented in the specified code page.
&lt;br&gt;&amp;lt;/quote&amp;gt;&lt;br&gt;&lt;br&gt;These 2 sentences contradict each other if there are chars that can't be translated but WC_DEFAULTCHAR is not specified.</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541365</link><pubDate>Wed, 01 Mar 2006 20:12:08 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541365</guid><dc:creator>Nick Lamb</dc:creator><description>Serge:&lt;br&gt;&lt;br&gt;I didn't debug this code, I just wrote out the answers to Raymond's questions. The person quoted by Raymond is debugging, and most likely they shouldn't be, because as we've seen they're trying to use this string with the so-called ANSI Windows APIs.&lt;br&gt;&lt;br&gt;I'm well aware that most software doesn't talk to Windows. I've never taken a job where I wrote Windows software, and on every occasion that I've had to program for Win32 (e.g. to help a friend) I've found the experience unpleasant and not to be recommended. - to forestall your most likely next question, Raymond's articles are generally interesting regardless of whether I like Windows.&lt;br&gt;&lt;br&gt;Michael:&lt;br&gt;&lt;br&gt;When lpDefaultChar is NULL and lpUsedDefaultChar is not, the system default character is used. 
The WC_NO_BEST_FIT_CHARS flag is passed in the example.</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541376</link><pubDate>Wed, 01 Mar 2006 20:28:29 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541376</guid><dc:creator>Neil</dc:creator><description>WriteConsoleA would have printed the string correctly.</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541400</link><pubDate>Wed, 01 Mar 2006 20:57:35 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541400</guid><dc:creator>dmz</dc:creator><description>&amp;gt; I just hope their Polish customers can figure out what the text is supposed to say.&lt;br&gt;&lt;br&gt;Well, speaking for myself, I got used to this &amp;#179;&amp;lt;-&amp;gt;ł &amp;#185;&amp;lt;-&amp;gt;ą mixup and my internal parser makes the &amp;quot;best match&amp;quot; automatically. The problem is:&lt;br&gt;1) It is slow (&amp;quot;Detail&amp;quot; should be &amp;quot;Szczeg&amp;#243;ł&amp;quot;, not &amp;quot;Szczeg&amp;#243;&amp;#179;&amp;quot;, although this scores 7/8, so is quite easy)&lt;br&gt;2) It shouldn't be used at all -&amp;gt; this kinda mixup only shows me that I don't want to buy anything from this company, but...&lt;br&gt;3) Companies don't really care (which is wrong) and we are used to this (e.g. nobody protests -&amp;gt; and *that* is the mistake)</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541441</link><pubDate>Wed, 01 Mar 2006 21:47:48 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541441</guid><dc:creator>dave</dc:creator><description>8-bit characters? 
&amp;nbsp;How quaint.</description></item><item><title>Speaking of encodings</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541546</link><pubDate>Thu, 02 Mar 2006 00:19:07 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541546</guid><dc:creator>roxfan</dc:creator><description>Reminds me of this story:&lt;br&gt;&lt;a rel="nofollow" target="_new" href="http://community.livejournal.com/velik_moguch/242083.html"&gt;http://community.livejournal.com/velik_moguch/242083.html&lt;/a&gt;&lt;br&gt;Short summary: a Russian girl asked her French friend to send her a book and wrote the address in an email. The French, not in the least surprised that Russian uses mostly accented Latin vowels, carefully wrote it down on the envelope.</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541592</link><pubDate>Thu, 02 Mar 2006 01:33:03 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541592</guid><dc:creator>Maciej Rutkowski</dc:creator><description>A few wrong characters because of a bad codepage is not even merely bad compared to what I'm sometimes seeing in Polish translations of hardware manuals. ;)&lt;br&gt;[Not that many of them are worth reading anyway]</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541748</link><pubDate>Thu, 02 Mar 2006 04:41:12 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541748</guid><dc:creator>Dean Harding</dc:creator><description>Nick: You're right in that they should be writing Unicode software, but why abandon Win32? 
Those are two totally orthogonal suggestions, and the second has nothing to do with this discussion.</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541805</link><pubDate>Thu, 02 Mar 2006 07:11:42 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541805</guid><dc:creator>asdf</dc:creator><description>Is there a technical reason for why there is no (SetACP) function in the winapi to set the active codepage on a per process (or thread) basis?</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541829</link><pubDate>Thu, 02 Mar 2006 08:03:08 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541829</guid><dc:creator>Dean Harding</dc:creator><description>asdf: You can sort of do it with SetThreadLocale, which actually changes CP_THREAD_ACP. However, it's pretty ugly.&lt;br&gt;&lt;br&gt;CP_ACP is the *system* code page for a reason: changing the code page affects a lot more than just what WideCharToMultiByte would do. It also affects things like resource loading (and then what happens if you load a resource, change the thread locale, then load another resource? They don't match anymore! And it's even weirder with the way Windows caches various resources [like dialogs]).
&lt;br&gt;&lt;br&gt;Besides, CP_ACP is based off a user's preferences - they've said &amp;quot;I understand Polish, and I like my interface to be in Polish, please.&amp;quot; You shouldn't go around trumping their preference.</description></item><item><title>OEM Codepage Fun</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#541933</link><pubDate>Thu, 02 Mar 2006 11:20:17 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541933</guid><dc:creator>Gideon</dc:creator><description>The fun starts when you are writing a console-mode application.&lt;br&gt;&lt;br&gt;- You need to call SetFileApisToOem() for the file APIs.&lt;br&gt;&lt;br&gt;- You need to call setlocale(LC_ALL, &amp;quot;.OCP&amp;quot;) to set the OEM codepage for the locale functions.&lt;br&gt;&lt;br&gt;- You need to call _setmbcp(_MB_CP_LOCALE) to adjust the multi-byte string functions (_mbschr, etc) for the OEM code page.&lt;br&gt;&lt;br&gt;- Then you need to work around the bug that mangles the command line argv arguments. &amp;nbsp;(CharToAnsi is wrongly called on them.)&lt;br&gt;&lt;br&gt;I wrote a standard library routine that does all the above for my console apps: OemCodePageHell().
&lt;br&gt;&lt;br&gt;Microsoft wrote their own workaround for their console apps and put it in ULIB.DLL.</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#542025</link><pubDate>Thu, 02 Mar 2006 15:32:38 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:542025</guid><dc:creator>Ben Bryant</dc:creator><description>Serge Wautier won.</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#542105</link><pubDate>Thu, 02 Mar 2006 17:47:45 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:542105</guid><dc:creator>8</dc:creator><description>Gideon, is your library open sourced? If so, where can I find it?&lt;br&gt;&lt;br&gt;I only knew about SetFileApisToOem and SetFileApisToAnsi, and didn't know setlocale is in msvcrt.&lt;br&gt;&lt;br&gt;Can't find much info on ulib atm. I'll look into it if I find anything.</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#542272</link><pubDate>Thu, 02 Mar 2006 21:12:31 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:542272</guid><dc:creator>Gideon</dc:creator><description>The library is closed source (sorry). &amp;nbsp;However I implemented the same OEM code page tricks for my Windows 'ls' console utility, &lt;a rel="nofollow" target="_new" href="http://utools.com/msls.htm"&gt;http://utools.com/msls.htm&lt;/a&gt;. 
&amp;nbsp;It is open source under the GNU GPL license.</description></item><item><title>re: Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2006/03/01/541266.aspx#543666</link><pubDate>Sat, 04 Mar 2006 19:43:44 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:543666</guid><dc:creator>8</dc:creator><description>Thank you very much!</description></item></channel></rss>