chcp can't do everything

Sorting it all Out
Michael Kaplan's random stuff of dubious value

The chcp.com utility is a simple little program sitting in the \WINDOWS\SYSTEM32 subdirectory. Running it with /? will give some helpful information about its purpose:

C:\WINDOWS\system32>chcp /?
Displays or sets the active code page number.

CHCP [nnn]

  nnn   Specifies a code page number.

Type CHCP without a parameter to display the active code page number.

There is also more information in the Windows XP documentation, which does hint at a problem in its small list of "supported" code pages:

Code page   Country/region or language
437         United States
850         Multilingual (Latin I)
852         Slavic (Latin II)
855         Cyrillic (Russian)
857         Turkish
860         Portuguese
861         Icelandic
863         Canadian-French
865         Nordic
866         Russian
869         Modern Greek

None of the ACP (ANSI code page) values are in that list, though I think this is a bit of social engineering, meant to keep people thinking of the console code page as the OEM code page. The 125x series code pages also work well here.
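To make the OEM/ACP distinction concrete, here is a minimal Python sketch (not from the original post) that decodes one and the same byte under two OEM code pages and one ANSI code page. It relies only on Python's built-in cp437, cp850, and cp1252 codecs; the helper name decode_under is made up for illustration.

```python
import unicodedata


def decode_under(code_page: int, raw: bytes) -> str:
    """Decode raw bytes with Python's codec for a Windows code page number.

    Hypothetical helper for illustration; it does not call chcp or touch
    the console, it only shows how the byte-to-character mapping differs.
    """
    return raw.decode(f"cp{code_page}")


raw = b"\xe9"  # a single byte above the ASCII range

for cp in (437, 850, 1252):
    ch = decode_under(cp, raw)
    # Print the Unicode name rather than the glyph, so the output does not
    # itself depend on the console code page or font.
    print(cp, f"U+{ord(ch):04X}", unicodedata.name(ch))

# 437 U+0398 GREEK CAPITAL LETTER THETA
# 850 U+00DA LATIN CAPITAL LETTER U WITH ACUTE
# 1252 U+00E9 LATIN SMALL LETTER E WITH ACUTE
```

The same byte reads as three different characters, which is why a file written under one code page can look wrong when displayed under another.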

However, another set that is missing from the list is the ideographic code pages. You cannot use chcp to change to one of the ideographic code pages unless it is also the default system OEM code page.

Thus on a system whose default system locale is 0x0409 (English, United States):

C:\WINDOWS\system32>chcp 932
Invalid code page

C:\WINDOWS\system32>chcp 936
Invalid code page

C:\WINDOWS\system32>chcp 949
Invalid code page

C:\WINDOWS\system32>chcp 950
Invalid code page

This is a known and expected limitation for which there is no workaround....
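For context, the four code pages rejected above are the Japanese (932), Simplified Chinese (936), Korean (949), and Traditional Chinese (950) code pages, all of which are double-byte. The post does not say that this is the reason for the limitation; the Python sketch below only illustrates how they differ from a single-byte code page such as 437, using Python's built-in codecs. The helper name encoded_length is hypothetical.

```python
def encoded_length(code_page: int, text: str) -> int:
    """Return how many bytes `text` occupies in the given code page.

    Hypothetical helper for illustration; raises UnicodeEncodeError when
    the code page has no mapping for a character in `text`.
    """
    return len(text.encode(f"cp{code_page}"))


ideograph = "\u65e5"  # the ideograph for "sun/day", present in all four code pages

for cp in (932, 936, 949, 950):
    # Each of these code pages needs two bytes for the ideograph.
    print(cp, encoded_length(cp, ideograph))

try:
    encoded_length(437, ideograph)
except UnicodeEncodeError:
    # The single-byte OEM code page 437 has no slot for it at all.
    print(437, "cannot encode this character")
```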

 

This post brought to you by "Ā" (U+0100, a.k.a. LATIN CAPITAL LETTER A WITH MACRON)

 

Blog - Comment List
  • While we're at it, why not try "chcp 65001" (65001 = CP_UTF8)? Amazing but it actually works... but only if you set the console to use a TrueType font (and of course, unless you switch to full-screen text mode).

    Too bad Lucida Console doesn't contain Hebrew glyphs. Given that there are other monospaced TrueType fonts in my system (such as Courier New, which happens to contain Hebrew glyphs), what makes a TrueType font appear in the console's Properties | Font screen?
  • When you enable codepage 65001, batch files and cmd scripts no longer run. No idea why, but it's a bit of a show-stopper for chcp 65001.
  • Ilya: See here: http://support.microsoft.com/default.aspx?scid=kb;EN-US;Q247815
  • "However, another set that is missing from the list is the ideographic code pages. You cannot use chcp to change to one of the ideographic code pages unless it is also the default system OEM code page."

    I installed support for East Asian Languages into my English Windows XP Pro system, and have documents with filenames using ideograph characters. Is there really no way for me to work in the command shell (cmd.exe) with these files? Do I need to go buy a version of XP for the East Asian language I am interested in? I was hoping there was something easier I could do.
  • It means you cannot change to one of the CJK code pages, Mike. You can certainly try 'chcp 65001' and you can also try 'cmd /u' to see if you can work with them.

    Or you can even change the default system locale and then the oemcp will match by default if you switch to the right one.

    Lots of options....
  • Thanks for the suggestions Michael.

    I have seen mention of code page 65001, but I don't see any effect by switching to it. I have created test data in Arabic, in Cyrillic, and in Japanese, and code page 65001 cannot display any of it correctly. At certain font sizes, if I choose Courier New as my console font I can work with Arabic and Cyrillic (on any code page, does not require a chcp).

    Likewise, 'cmd /u' does not appear to affect the display, although it does a very good job of creating proper Unicode output. That is, I can do a 'dir' and still not *see* anything in the console, but if I do a 'dir > results.txt' then I get a Unicode text file (I believe) that is readable with Notepad and all of the characters display there correctly.

    I will attempt to change the system locale but had hesitated on trying this because I was hoping to avoid system-wide changes requiring a reboot, whenever I needed to work with certain file data.
  • Hi Mike,

    Well, perhaps moving out of the [legacy] console world might be the best solution, in that case? Unicode apps have a much easier time when they are not stuck there....

    With that said, I had very little trouble converting console projects to Unicode in the past (I'll be blogging about this soon)....
  • Apologies to Stanislav Kniazev: I removed the table, since it is really unreadable in that format. Better to just provide the link to the MSDN topic instead?
  • If you need unicode output in file, you must use command CMD with option /U.
    If you need unicode output in MS console, you must use the following command:
    chcp 65001 && <your_command>, where <your_command> is any command or batch file. Font property for MS console in this case must be of course changed to "Lucida Console".
    In the following table you find code pages for all charsets:
    http://msdn.microsoft.com/workshop/author/dhtml/reference/charsets/charset4.asp
  • I got here via a mutual friend (Google). I want to use the 1252 code page. Chcp 1252 seems to respond well (it says 1252 is now active), but when I type a file that contains upper ASCII characters, it still displays them as one would expect under code page 437. I tried writing to the screen from a program, and I tried typing characters directly by holding down ALT and typing the decimal value; it still shows me the page 437 characters, not the 1252 characters. As far as I can tell, all CHCP does under XP is tell you that your page is active, and otherwise nothing. :(  I would greatly appreciate any pointers. Thanks.

  • Change out of the raster font, perhaps? Move to Lucida Console....

  • interesting reading here, thank you.

    I've had an occurrence of the path variable on XP Pro being displayed correctly once only as human readable, then as ASCII chars only.

    turned out the codepage for this machine was set to 850, if I then manually set the codepage to 437, the path variable remains human readable [that is; from a command prompt screen output].

    the mystery is ; the machine is set to australian english in regional settings; there's no multi linguallity [is that a word?] other than that.

    what else could possibly cause the path variable [and it's the only environment variable to be affected] to display ascii chars?

    I'm also making the assumption that the ascii chars may cause some apps. to not read the path var. properly, right?

  • This blog dedicated to six weeks I almost got to spend in Australia nearly 15 years ago, and a Kinks

  • >If you need unicode output in file, you must use command CMD with option /U.

    I've tried

    CMD /U Tree /a > temp.txt

    in line 1 of a batch file, which stops at this command, and nothing is actually executed.

    temp.txt is not generated and the script does not continue.

    Using a Japanese Windows XP, eastern language enabled of course

    And for CHCP, the DOS batch script will just close after running CHCP; anything below CHCP will not be executed either.

  • Is there a replacement for chcp on a 64-bit client? I was browsing through system32 and found chcp,

    but looking through SysWOW64, I saw nothing that resembles a chcp setting.
