<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx</link><description>There are two 8-bit code pages in common use in Windows. Make sure you know the difference.</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389534</link><pubDate>Tue, 08 Mar 2005 15:19:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389534</guid><dc:creator>Dave</dc:creator><description>I presume the &amp;quot;^&amp;quot; character in the command-line example above is a typo...&lt;br&gt;</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389536</link><pubDate>Tue, 08 Mar 2005 15:21:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389536</guid><dc:creator>^</dc:creator><description>cmd /U /C dir ^&amp;gt;files.txt&lt;br&gt;&lt;br&gt;Why do you use a ^ to make the &amp;gt; literal?
&lt;br&gt;The result is the same (in this case).</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389538</link><pubDate>Tue, 08 Mar 2005 15:24:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389538</guid><dc:creator>Ben Hutchings</dc:creator><description>Would I be right in guessing that &amp;quot;OEM&amp;quot; refers to the fact that the character encodings used by DOS were effectively chosen by the PC manufacturer (OEM) and burnt into the video card ROM?&lt;br&gt; &lt;br&gt;&amp;quot;The command processor has an option (/U) to generate all piped and redirected output in Unicode rather than the OEM code page.&amp;quot;&lt;br&gt;&lt;br&gt;How does this work? Does the command interpreter pipe the output through itself?</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389540</link><pubDate>Tue, 08 Mar 2005 15:27:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389540</guid><dc:creator>scritch</dc:creator><description>What are the characters that are in Canadian French and not in European French  ?</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389543</link><pubDate>Tue, 08 Mar 2005 15:29:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389543</guid><dc:creator>Obviator</dc:creator><description>So then why didn't we get two consoles? One for running MS-DOS programs, one for doing real work (compiling etc). To avoid confusion, the former could be restricted to running 16-bit DOS executables, and the latter to 32-bit Win command line programs. (But that isn't a necessity.)
&lt;br&gt;&lt;br&gt;Having said this, I would bet that a decent implementation of console for Windows is somewhere to be found on the 'net, using the ANSI code page and everything.</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389547</link><pubDate>Tue, 08 Mar 2005 15:38:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389547</guid><dc:creator>Mike R</dc:creator><description>There are 2: The 32-bit cmd.exe, and the 16-bit command.com.. :)&lt;br&gt;</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389548</link><pubDate>Tue, 08 Mar 2005 15:39:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389548</guid><dc:creator>Kim Gräsman</dc:creator><description>As a Swede with a dotted-character name, this really messes up my console apps :-)&lt;br&gt;&lt;br&gt;Shouldn't I be able to use the CP command, or similar, to set the ANSI codepage on my command prompt to get rid of the discrepancy?&lt;br&gt;&lt;br&gt;Can anyone provide details?&lt;br&gt;&lt;br&gt;Thanks,&lt;br&gt;- Kim</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389551</link><pubDate>Tue, 08 Mar 2005 15:41:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389551</guid><dc:creator>Michael Kaplan</dc:creator><description>The funniest thing about the OEM code pages is who the original OEM was -- it was IBM. Ours were mostly cooler and definitely covered more languages, though....
&lt;br&gt;&lt;br&gt;More info on the ACP/OEMCP split at &lt;a target="_new" href="http://blogs.msdn.com/michkap/archive/2005/02/08/369197.aspx"&gt;http://blogs.msdn.com/michkap/archive/2005/02/08/369197.aspx&lt;/a&gt; and an antecdote from Helen Custer about where the ANSI term came from here at &lt;a target="_new" href="http://blogs.msdn.com/michkap/archive/2005/03/01/382289.aspx"&gt;http://blogs.msdn.com/michkap/archive/2005/03/01/382289.aspx&lt;/a&gt; . :-)</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389584</link><pubDate>Tue, 08 Mar 2005 16:20:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389584</guid><dc:creator>AC</dc:creator><description>Canadian French puts accents on capital letters. French French doesn't, which leads to ambiguity in all-caps newspaper headlines (peche vs p&amp;#234;che vs p&amp;#233;ch&amp;#233;) but looks more beautiful on the printed page.&lt;br&gt;I think IBM just forgot about the &amp;#216; character. Their people were certainly embarrassed enough about the omission at the time!</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389586</link><pubDate>Tue, 08 Mar 2005 16:22:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389586</guid><dc:creator>Chris Lundie</dc:creator><description>Don't know if this is the right answer, but Canadian French uses upper-case vowels with accents, which is kind of unique.&lt;br&gt;&lt;br&gt;Also I was surprised to learn that in French, you are supposed to put a non-break space in front of a colon. 
I always wondered why that character existed.</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389602</link><pubDate>Tue, 08 Mar 2005 16:35:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389602</guid><dc:creator>Ben Hutchings</dc:creator><description>Michael: I heard that MS had quite a bit of input into the design of cp437 though. IBM and MS cooperated very closely until MS decided to focus on Windows instead of OS/2.&lt;br&gt;&lt;br&gt;Obviator, Kim: Use the chcp command to change the code page used by the command interpreter and its console. You probably want code page 1252.</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389617</link><pubDate>Tue, 08 Mar 2005 17:01:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389617</guid><dc:creator>Jonathan Perret</dc:creator><description>That French capitals should not be accented, in France or elsewhere, is an unfortunately very common misconception, relayed by many a clueless schoolteacher.&lt;br&gt;&lt;br&gt;I can only point interested people over there (French only, though I doubt this is a problem for the aforementioned) :&lt;br&gt;&lt;a target="_new" href="http://www.langue-fr.net/d/maj_accent/maj_accent.htm"&gt;http://www.langue-fr.net/d/maj_accent/maj_accent.htm&lt;/a&gt;&lt;br&gt;&lt;br&gt;Re the console CP : UTF-8 is the future. Use CHCP 65001 !
&lt;br&gt;</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389623</link><pubDate>Tue, 08 Mar 2005 17:07:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389623</guid><dc:creator>Raymond Chen</dc:creator><description>The ^ emphasizes that the redirection is processed by the inner command processor.</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389651</link><pubDate>Tue, 08 Mar 2005 17:31:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389651</guid><dc:creator>Joshua Schaeffer</dc:creator><description>&amp;quot; As a Swede with a dotted-character name, this really messes up my console apps :-) &amp;quot;&lt;br&gt;&lt;br&gt;My last name in the original German also has an A-umlaut. Are your spelling rules the same as German ones for eliminating it?</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389665</link><pubDate>Tue, 08 Mar 2005 17:52:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389665</guid><dc:creator>Mihai</dc:creator><description>&amp;quot;Canadian French uses upper-case vowels with accents, which is kind of unique.&amp;quot;&lt;br&gt;This seems to be related to the fact that OEM code pages did not have accented uppercase vowels. So the French user lived with what he had. It was incorrect, but it was all that was available. I have seen this with other languages too.&lt;br&gt;&lt;br&gt;&amp;quot;Also I was surprised to learn that in French, you are supposed to put a non-break space in front of a colon.&amp;quot;&lt;br&gt;And not only colon. Empirical rule is &amp;quot;before any punctuation with two elements and inside quotes&amp;quot;. This means before : ; ? !
&lt;br&gt;And the French use chevrons for quotes, like this &amp;#171;&amp;amp;nbsp;quoted&amp;amp;nbsp;&amp;#187; text.&lt;br&gt;</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389673</link><pubDate>Tue, 08 Mar 2005 17:58:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389673</guid><dc:creator>Mihai</dc:creator><description>You can change the code page of the console using chcp and change the font to a non-raster font.&lt;br&gt;For Latin 1 use chcp 1252 and Lucida Console.&lt;br&gt;Then you can &amp;quot;type&amp;quot; a file and looks ok.&lt;br&gt;You can do the same from code (SetConsoleOutputCP for output and SetConsoleCP for input).&lt;br&gt;&lt;br&gt;And small correction to the previous post &amp;quot;before any punctuation with two elements, and inside quotes&amp;quot; (comma added :-)</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389726</link><pubDate>Tue, 08 Mar 2005 19:12:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389726</guid><dc:creator>Christoffer "Kreiger" Hammarström</dc:creator><description>I've always wondered why Microsoft has a fetish for calling its text encodings &amp;quot;ANSI&amp;quot; and &amp;quot;OEM&amp;quot;, instead of using the IANA standardized names. &lt;br&gt;&lt;br&gt;&lt;a target="_new" href="http://www.iana.org/assignments/character-sets"&gt;http://www.iana.org/assignments/character-sets&lt;/a&gt;&lt;br&gt;&lt;br&gt;Doesn't OEM stand for &amp;quot;Original Equipments Manufacturer&amp;quot; ?
&lt;br&gt;&lt;br&gt;Also, isn't the ANSI family of text encodings based on an early version of what later became standardized as &amp;quot;iso-8859-*&amp;quot; ?</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389728</link><pubDate>Tue, 08 Mar 2005 19:17:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389728</guid><dc:creator>Wesha</dc:creator><description>I just wonder who was that... ummm... insightful person in Windows development team who decided to invent YET ANOTHER codepage for Cyrillic when developing windows? We ALREADY had three -- national standard KOI8-R, MS-DOS standard CP-866 and Apple's MacCyrillic. And of course M$ couldn't just take the national standard -- or at least its own DOS standard; that'd be way too rational. You had to invent a _totally new one_, like we didn't have enough already. How characteristic.&lt;br&gt;</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389733</link><pubDate>Tue, 08 Mar 2005 19:32:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389733</guid><dc:creator>Christoffer "Kreiger" Hammarström</dc:creator><description>Also, a (non-Microsoft-specific) pet peeve or mine is when someone refers to &amp;quot;Unicode&amp;quot;, without qualifying *what* Unicode-encoding is used. UTF-8? UCS-2? UCS-4?
&lt;br&gt;It seems Microsoft most often says &amp;quot;Unicode&amp;quot; to mean &amp;quot;UCS-2&amp;quot;.</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389737</link><pubDate>Tue, 08 Mar 2005 19:37:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389737</guid><dc:creator>Christoffer "Kreiger" Hammarström</dc:creator><description>Raymond: I might add that you have a UTF-8 problem in the &amp;quot;name&amp;quot; form field.&lt;br&gt;My name is pre-entered as 'Christoffer &amp;quot;Kreiger&amp;quot; Hammarstr&amp;#195;&amp;#182;m' when the URL ends with &amp;quot;?Pending=true&amp;quot;, and as normal when the URL doesn't.</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389743</link><pubDate>Tue, 08 Mar 2005 19:51:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389743</guid><dc:creator>Raymond Chen</dc:creator><description>I don't run the web site. If you have feedback about the server software, you can send it to Scott W.</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389846</link><pubDate>Tue, 08 Mar 2005 22:55:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389846</guid><dc:creator>Michael Grier [MSFT]</dc:creator><description>Re: ANSI vs. OEM:&lt;br&gt;&lt;br&gt;Don't try to read too much into them.  OEM is the DOS code page that the BIOS uses for FAT filesystems.  Thus it's set by the hardware manufacturer.&lt;br&gt;&lt;br&gt;The use of the term &amp;quot;ANSI&amp;quot; for the other codepage is probably a misnomer; it's just a variable which can be set to any of a number of code pages.&lt;br&gt;&lt;br&gt;Re: Unicode vs. UCS-2 vs. 
Utf-16:&lt;br&gt;&lt;br&gt;We stil have this problem a lot.  Most people don't know anything more about Unicode than it's more complex than ASCII was and maybe less complex than MBCS was, and it takes two bytes per character.  People that I try to explain the truth to usually get a glazed over look in their eyes and ask something like, &amp;quot;yeah, well, that's interesting.  What do I really have to worry about?&amp;quot;&lt;br&gt;&lt;br&gt;The official story is that as of Windows 2000, we consider the two-bytes-per-cell encoding to now be Utf-16 rather than UCS-2 because the core rendering and UI pieces changed to deal with surrogate pairs then.  On the other hand, very little software will avoid splitting surrogate pairs or combining diacritics so YMMV.</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389899</link><pubDate>Wed, 09 Mar 2005 00:36:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389899</guid><dc:creator>Matt</dc:creator><description>One time, I was surprised to find out that the Microsoft C compiler converts Unicode strings from ANSI to Unicode (or is it OEM to Unicode :-)  So, if you create an array like TCHAR array[] = { 0x80, 0x81, 0x82, 0x83 }; the result won't be what you expect!&lt;br&gt;</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389922</link><pubDate>Wed, 09 Mar 2005 01:23:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389922</guid><dc:creator>Jon Potter</dc:creator><description>That's because the definition of TCHAR changes depending on whether UNICODE is defined or not. 
Nothing surprising there...</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389957</link><pubDate>Wed, 09 Mar 2005 01:42:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389957</guid><dc:creator>Norman Diamond</dc:creator><description>As far as I can tell from old experiments booting Windows 9x in real mode and watching Windows 2000 etc. during cold boot and during wakeup from hibernation, there are more than one OEM code page.  At least one is a US OEM code page (is that 437?) and one is a Japanese OEM code page (932).  The Japanese OEM code page is identical to the Japanese ANSI code page so we don't have problems copying between Notepad and MS-DOS command prompts in any Windows 9x or NT-based system.  But this also means we don't really know what the difference is until Mr. Chen teaches us ^_^&lt;br&gt;&lt;br&gt;However,&lt;br&gt;&lt;br&gt;&amp;gt; when you open an 8-bit text file in Notepad,&lt;br&gt;&amp;gt; it assumes the ANSI code page&lt;br&gt;&lt;br&gt;But which ANSI code page does it assume?  As you pointed out in a previous blog posting, sometimes it even assumes an ANSI code page which the user has never used and the user might not even have installed fonts for it.&lt;br&gt;&lt;br&gt;3/8/2005 2:55 PM Michael Grier [MSFT]&lt;br&gt;&lt;br&gt;&amp;gt; OEM is the DOS code page that the BIOS uses&lt;br&gt;&amp;gt; for FAT filesystems.&lt;br&gt;&lt;br&gt;Huh?????  Since when does the BIOS examine filesystem structures and look for filenames?  
And since when is the assumed ANSI code page for filenames chosen by a hardware manufacturer instead of being chosen by whichever language version of Windows is installed?</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#389996</link><pubDate>Wed, 09 Mar 2005 02:07:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:389996</guid><dc:creator>foxyshadis</dc:creator><description>The comment probably needs to be clarified; OEM is the original DOS code page that was at the time burned into video memory around $C000 (I believe?). I don't think it was in the BIOS, unless the video read it out of the BIOS; the BIOS just sent raw binary to the video processing chip, which used its burned-in code page (actually bitmaps) to render characters, save some reserved ones &amp;lt;32.&lt;br&gt;&lt;br&gt;Eventually it was definitely moved into the OS, copied to ensure minimal change from the huge number of DOS apps then in existence.&lt;br&gt;&lt;br&gt;Thankfully, with Unicode we can once more play text mode cards with proper suit pictures in any code page. ^_~</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#390021</link><pubDate>Wed, 09 Mar 2005 02:39:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:390021</guid><dc:creator>Isaac Chen</dc:creator><description>Christoffer: &amp;quot;It seems Microsoft most often says Unicode to mean UCS-2&amp;quot;&lt;br&gt;IIRC, it should be UTF-16, at least in Windows 2000+ systems.
&lt;br&gt;</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#390090</link><pubDate>Wed, 09 Mar 2005 04:48:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:390090</guid><dc:creator>Mihai</dc:creator><description>For Norman Diamond:&lt;br&gt;Windows terminology:&lt;br&gt;  ANSI Code Page=System Active Code Page (ACP)&lt;br&gt;  OEM Code Page=default console code page&lt;br&gt;Both of them can be double byte or non-Latin 1.&lt;br&gt;&lt;br&gt;See Michael Kaplan's blog:&lt;br&gt;&lt;a target="_new" href="http://blogs.msdn.com/michkap/archive/2005/02/08/369197.aspx"&gt;http://blogs.msdn.com/michkap/archive/2005/02/08/369197.aspx&lt;/a&gt;&lt;br&gt;</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#390094</link><pubDate>Wed, 09 Mar 2005 04:53:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:390094</guid><dc:creator>Michael Grier [MSFT]</dc:creator><description>The BIOS comment was (mostly) wrong; however the OEM code page is set by the OEM.  FAT (12, 16 and 32) store non-extended directory entries in the OEM code page.  I don't believe it's possible to change the OEM code page easily; specifically it will mess up the character sets used on your FAT drives.&lt;br&gt;&lt;br&gt;LFN directory entries are stored in two-byte-per-character encoded form.  If the filename does not require a short vs. long name it is only stored as a SFN, encoded in the OEM code page.  (This is from reading the FAT32 source code on NT; someone with more history should jump in here and correct my errors and fill in holes.)&lt;br&gt;&lt;br&gt;The OEM code page is associated with the hardware so that when booting between various OSes, they can all agree on the code page used for the FAT volumes.
&lt;br&gt;&lt;br&gt;There is assumed to be a single system-wide OEM code page so anything reasonable like recording the code page in the FAT metadata is not done.&lt;br&gt;&lt;br&gt;You can probably have some real fun with FAT removable media in this way going to machines which have different code pages.&lt;br&gt;</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#390153</link><pubDate>Wed, 09 Mar 2005 07:28:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:390153</guid><dc:creator>Maxime LABELLE</dc:creator><description>Jonathan Perret:  That French capitals should not be accented, in France or elsewhere, is an unfortunately very common misconception&lt;br&gt;&lt;br&gt;It is worth mentioning that this is all the more true. I invite french readers to lookup the official position of the Acad&amp;#233;mie Fran&amp;#231;aise on this topic:&lt;br&gt;&lt;br&gt;&lt;a target="_new" href="http://www.academie-francaise.fr/langue/questions.html#accentuation"&gt;http://www.academie-francaise.fr/langue/questions.html#accentuation&lt;/a&gt;&lt;br&gt;</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#390184</link><pubDate>Wed, 09 Mar 2005 08:42:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:390184</guid><dc:creator>Purplet</dc:creator><description>Is this the reason why Alt+128 gives &amp;#199; and Alt+0128 gives € ? &lt;br&gt;Is the leading 0 a way to select the codepage when inserting a character using the keypad ?
&lt;br&gt;&lt;br&gt;</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#390194</link><pubDate>Wed, 09 Mar 2005 09:15:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:390194</guid><dc:creator>Mats Gefvert</dc:creator><description>I always put AutoRun=&amp;quot;chcp 1252&amp;quot; in HKEY_CURRENT_USER\Software\Microsoft\Command Processor.&lt;br&gt;&lt;br&gt;I don't see why I should bother with any other code page anyway... :)</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#390214</link><pubDate>Wed, 09 Mar 2005 09:40:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:390214</guid><dc:creator>John Elliott</dc:creator><description>foxyshadis: The original OEM codepage was in a ROM on the MDA/CGA cards and didn't show up in the PC's address space at all. I believe GRAFTABL was the first time it appeared in DOS.</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#390423</link><pubDate>Wed, 09 Mar 2005 12:22:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:390423</guid><dc:creator>Timo Frenay</dc:creator><description>Purplet: The leading 0 is to ensure backwards compatibility. Especially in the days of DOS when tools like Character Map weren't easily available, people would memorize Alt+xxx codes that they used frequently. To prevent chaos when they would try to enter these codes in Windows, the leading 0 was introduced for the ANSI codepage.
&lt;br&gt;&lt;br&gt;(Although I am a big fan of Character Map, I have memorized the Alt+0xxx codes for most of the European accented vowels.)</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#390480</link><pubDate>Wed, 09 Mar 2005 12:43:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:390480</guid><dc:creator>Neil</dc:creator><description>Now if only I could set the active code page to UTF-8 in a GUI application, then I could call the A APIs avoiding all the UTF-8 to UTF16 conversions necessary to call all the W APIs...</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#390782</link><pubDate>Wed, 09 Mar 2005 16:37:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:390782</guid><dc:creator>Ben Hutchings</dc:creator><description>Timo Frenay: You could make life easier for yourself by using MSKLC to create a keyboard layout with AltGr combinations for the accented vowels.</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#390877</link><pubDate>Wed, 09 Mar 2005 18:01:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:390877</guid><dc:creator>Jonathan</dc:creator><description>You Europeans have it easy...&lt;br&gt;&lt;br&gt;in the PC originally, the character set was indeed saved in the display card's ROM (not 0xC000, somewhere in the 0xF000 segment (remember segments?)). Cards sold in Israel had to have their ROM specially programmed to have Hebrew letters (in the encoding now known as OEM codepage 862) - Cirrus-Logic-based cards were popular partly for this reason. I remember making my own TSR to set this.
&lt;br&gt;&lt;br&gt;Same goes for text-mode printers (some ads specifically said &amp;quot;with burned Hebrew&amp;quot; - HP did this a lot).&lt;br&gt;&lt;br&gt;Later, DOS added support to codepages, and you could (through some config.sys lines) re-program the display to any codepage you want. You could also upload something to the printer, though that was always third-party.&lt;br&gt;&lt;br&gt;I never figured out how to display Hebrew in a modern console app, though.</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#390904</link><pubDate>Wed, 09 Mar 2005 18:37:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:390904</guid><dc:creator>Sidoine</dc:creator><description>It would be a good thing if all the console program designed for Windows would switch the code page when they start, and use the system code page (1252 for example). The problem is that by default the console is configured to use a font that is not unicode, so you can't display Hebrew. But you can change the console font to lucida console and it works (change HKEY_CURRENT_USER/Console/).&lt;br&gt;There is still a problem: I don't know how to write in unicode. If I try to use the unicode code page (1200), wprintf doesn't work.</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#391152</link><pubDate>Wed, 09 Mar 2005 20:42:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:391152</guid><dc:creator>Michael J.</dc:creator><description>&amp;gt; Re: ANSI vs. OEM:&lt;br&gt;&amp;gt; Don't try to read too much into them. OEM is the DOS code page &lt;br&gt;&amp;gt; that the BIOS uses for FAT filesystems. Thus it's set by &lt;br&gt;&amp;gt; the hardware manufacturer. 
&lt;br&gt;&lt;br&gt;This is also the page used by DOS-mode printers and by countless printer drivers for printers which do not support national character set. This is also the default encoding for plain text file.&lt;br&gt;&lt;br&gt;&amp;gt; there are more than one OEM code page. At least one is a US OEM &lt;br&gt;&amp;gt; code page (is that 437?) and one is a Japanese OEM code page (932).&lt;br&gt;&lt;br&gt;Of course. Almost any non-latin alphabet coutry has its own codepage. Open any printer manual, and see how many codepages they support.&lt;br&gt;&lt;br&gt;&amp;gt; It would be a good thing if all the console program designed&lt;br&gt;&amp;gt; for Windows would switch the code page when they start,&lt;br&gt;&amp;gt; and use the system code page (1252 for example).&lt;br&gt;&lt;br&gt;Win16 I/O is built on top of DOS I/O and uses DOS conventions. Win32 simply had to support this, because there is only one filename entry per file, not one for DOS, one for Win16 and one for Win32. Now, wait: it is one for DOS and another for Windows in FAT32. Then DOS/Win3.x apps can have the good old codepage, and Win16-LFN/Win32 apps can have Unicode.</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#391160</link><pubDate>Wed, 09 Mar 2005 20:50:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:391160</guid><dc:creator>Ben Hutchings</dc:creator><description>Sidoine: The standard C library is intended to work with wide characters only internally. They are always converted to multibyte characters on output. If you want to write Unicode text to the console, you have two options that I can see:&lt;br&gt;(1) Set the console to display UTF-8 (chcp 65001) and set the C library to convert to that on output (I don't know how).&lt;br&gt;(2) Write UTF-16 to the console with the Win32 function WriteConsoleW. 
Of course you'll need to use WriteFile instead if output has been redirected.&lt;br&gt;&amp;quot;chcp 1200&amp;quot; results in the error message &amp;quot;Invalid code page&amp;quot;.</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#391443</link><pubDate>Thu, 10 Mar 2005 03:46:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:391443</guid><dc:creator>Norman Diamond</dc:creator><description>3/8/2005 8:53 PM Michael Grier [MSFT]&lt;br&gt;&lt;br&gt;&amp;gt; FAT (12, 16 and 32) store non-extended&lt;br&gt;&amp;gt; directory entries in the OEM code page.&lt;br&gt;&lt;br&gt;If you mean that software, somewhere in a filesystem layer for FAT filesystems, uses the currently set OEM code page, then I think I agree.  But you say &amp;quot;store&amp;quot;, so I think you mean that the recorded structures on the disk drive say what code page was used in writing the short filenames in their FATs, and that is surely wrong.&lt;br&gt;&lt;br&gt;&amp;gt; I don't believe it's possible to change the&lt;br&gt;&amp;gt; OEM code page easily;&lt;br&gt;&lt;br&gt;To change it where?  If you're talking about a copy burnt into video ROM then you're obviously right, but...&lt;br&gt;&lt;br&gt;If you're talking about in software:  When Windows 9x is booting, while in real mode before loading the GUI and protected mode drivers, it does change the code page that is in use.  If you don't have a graphical logo being displayed then you can watch the text messages and watch the character set change.  After Windows 9x is already running, and also in Windows NT-based systems, you can open a DOS-style command prompt.  The &amp;quot;US&amp;quot; command and &amp;quot;JP&amp;quot; command change the code page for that window.  
There's also a &amp;quot;MODE&amp;quot; command that seems able to change the code page for that window, and I think I once tried to select a European code page for it but didn't get very far.&lt;br&gt;&lt;br&gt;If you're talking about on a hard drive:  I still do not believe that the information stored in a partition says what code page was used in writing short filenames.&lt;br&gt;&lt;br&gt;&amp;gt; specifically it will mess up the character&lt;br&gt;&amp;gt; sets used on your FAT drives. &lt;br&gt;&lt;br&gt;It messes up the interpretation of filenames that are read from FAT drives, because the FAT does not say which code page was used, and the driver has to assume that the current Windows code page should be used.&lt;br&gt;&lt;br&gt;[Omitting one paragraph that I agree with.]&lt;br&gt;&lt;br&gt;&amp;gt; The OEM code page is associated with the&lt;br&gt;&amp;gt; hardware so that when booting between&lt;br&gt;&amp;gt; various OSes, they can all agree on the code&lt;br&gt;&amp;gt; page used for the FAT volumes.&lt;br&gt;&lt;br&gt;This 100% does not happen.  If you don't tell various OSes what code page you've been using on a FAT volume, they sure do not agree.  And in OSes where you cannot tell the OS what code page you've been using on a FAT volume, the OS will use the code page that it's been assuming and you end up with a partition which cannot be fully accessed by any language version of such an OS.&lt;br&gt;&lt;br&gt;&amp;gt; There is assumed to be a single system-wide&lt;br&gt;&amp;gt; OEM code page&lt;br&gt;&lt;br&gt;Some OSes do that ... as I just mentioned ...&lt;br&gt;&lt;br&gt;&amp;gt; so anything reasonable like recording the&lt;br&gt;&amp;gt; code page in the FAT metadata is not done. &lt;br&gt;&lt;br&gt;Um, I'm glad to see that you understand that (really), but then ... how is it possible that you wrote the nonsense that you did two paragraphs earlier?
&lt;br&gt;&lt;br&gt;&amp;gt; You can probably have some real fun with FAT&lt;br&gt;&amp;gt; removable media in this way going to&lt;br&gt;&amp;gt; machines which have different code pages. &lt;br&gt;&lt;br&gt;Same in a single machine, and same in an internal drive.  For example Microsoft used to support the possibility of installing different language versions of NT4 into different partitions in the same machine, and they could all view all partitions, but they could not access all files.  Microsoft did not support the same with Windows 9x but did provide some downloads to assist users who wanted to do that in Windows 95 (it was tougher in Windows 98 because the installer hunted down and destroyed the registries of existing different language versions of Windows 98).  They could view all FAT partitions but could not access all files.  Scandisk could corrupt some filenames and could delete others but sometimes could not finish the job of deleting a filename that it corrupted.</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#391551</link><pubDate>Thu, 10 Mar 2005 07:49:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:391551</guid><dc:creator>Kim Gräsman</dc:creator><description>Joshua:&lt;br&gt;&amp;quot;My last name in the original German also has an A-umlaut. Are your spelling rules the same as German ones for eliminating it?&amp;quot;&lt;br&gt;&lt;br&gt;I'm not sure we have any steadfast rules for substitution, but the following is what you generally see used:&lt;br&gt;&lt;br&gt;  &amp;#228; = ae&lt;br&gt;  &amp;#246; = oe&lt;br&gt;  &amp;#229; = aa&lt;br&gt;&lt;br&gt;Judging by your last name, that seems to match German rules, at least partially.&lt;br&gt;&lt;br&gt;Thanks everyone for the heads-up on CHCP!
&lt;br&gt;&lt;br&gt;- Kim</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#395008</link><pubDate>Mon, 14 Mar 2005 00:42:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:395008</guid><dc:creator>Norman Diamond</dc:creator><description>3/8/2005 11:12 AM Christoffer &amp;quot;Kreiger&amp;quot; Hammarstr&amp;#246;m&lt;br&gt;&lt;br&gt;&amp;gt; I've always wondered why Microsoft has a&lt;br&gt;&amp;gt; fetish for calling its text encodings &amp;quot;ANSI&amp;quot;&lt;br&gt;&amp;gt; and &amp;quot;OEM&amp;quot;, instead of using the IANA&lt;br&gt;&amp;gt; standardized names. &lt;br&gt;&amp;gt; &lt;a target="_new" href="http://www.iana.org/assignments/character-sets"&gt;http://www.iana.org/assignments/character-sets&lt;/a&gt;&lt;br&gt;&lt;br&gt;One reason might be because that page gives names for individual encodings of character sets but does not give names for two or three overall categories, at least not that I could see.&lt;br&gt;&lt;br&gt;A more generic reason for people possibly ignoring that page might be that IANA seems to ignore bug reports.  (I didn't even write cynically, it was my first correspondence with them and I had no reason to disrespect them, I hadn't even bought any broken products from them and am not subject to product tying when wanting to buy hardware, so I didn't even think of writing disrespectfully to them.  Well, that was at the time of writing the bug report.)</description></item><item><title>re: Keep your eye on the code page</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#404514</link><pubDate>Fri, 01 Apr 2005 12:49:00 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:404514</guid><dc:creator>Lars</dc:creator><description>&amp;gt; Re the console CP : UTF-8 is the future. Use CHCP 65001 ! 
&lt;br&gt;&lt;br&gt;CHCP 65001 indeed works quite well, but whenever I tried to run a batch file (pure ASCII!) from such a console, it never worked. There no output, no error error message, and the commands are not executed.&lt;br&gt;</description></item><item><title>What is the deal with the ES_OEMCONVERT flag?</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#440405</link><pubDate>Tue, 19 Jul 2005 17:00:13 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:440405</guid><dc:creator>The Old New Thing</dc:creator><description>It's pointless today.</description></item><item><title>Why is the default console codepage called &amp;amp;quot;OEM&amp;amp;quot;?</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#457486</link><pubDate>Mon, 29 Aug 2005 17:00:15 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:457486</guid><dc:creator>The Old New Thing</dc:creator><description>Because it once was, though no longer is.</description></item><item><title>Keep your eye on the code page, practical exam</title><link>http://blogs.msdn.com/oldnewthing/archive/2005/03/08/389527.aspx#541267</link><pubDate>Wed, 01 Mar 2006 18:00:29 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:541267</guid><dc:creator>The Old New Thing</dc:creator><description>Occasionally, somebody fails to pass.</description></item></channel></rss>