Van's House

I'm a developer on the C++ Shanghai team. I'm interested in everything related to C++.

Output Text in Unicode

The C standard only supports outputting text as multibyte characters (MBCS, Multi-Byte Character Set). Even the wide character output functions convert to multibyte characters before writing:

n1124.pdf, 7.19.3/12

The wide character output functions convert wide characters to multibyte characters and write them to the stream as if they were written by successive calls to the fputwc function. Each conversion occurs as if by a call to the wcrtomb function, with the conversion state described by the stream’s own mbstate_t object. The byte output functions write characters to the stream as if by successive calls to the fputc function.
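The conversion described in the quote can be sketched in a few lines. This is only an illustration of the "as if by a call to wcrtomb" wording, not how any particular C runtime implements it; WideToMultibyte is a hypothetical helper name I made up for this post.

```cpp
#include <climits>   // MB_LEN_MAX
#include <cstddef>
#include <cwchar>    // std::wcrtomb, std::mbstate_t
#include <string>

// Hypothetical helper (illustration only): convert a wide string to multibyte
// characters one at a time, the way the standard describes wide output.
// Returns false if a character can't be represented in the current LC_CTYPE locale.
bool WideToMultibyte(const std::wstring &wide, std::string &out)
{
    std::mbstate_t state = std::mbstate_t();   // initial conversion state
    char buf[MB_LEN_MAX];                      // large enough for any one character

    out.clear();
    for (wchar_t wc : wide) {
        std::size_t n = std::wcrtomb(buf, wc, &state);
        if (n == static_cast<std::size_t>(-1)) {
            return false;                      // no multibyte form in this locale
        }
        out.append(buf, n);
    }
    return true;
}
```

Because the result depends on the current locale, a character that has no representation in the active code page makes the conversion fail. That is the limitation the rest of this post is about.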

However, MBCS doesn’t support mixing characters from different code pages. For example, you can’t mix Chinese and Japanese text in the same output.

VC provides an extension that lets you output text in Unicode:

#include <cstdio>
#include <cwchar>   // fwprintf
#include <locale.h>

#include <io.h>
#include <fcntl.h>

void OutputMBCS(FILE *f)
{
    // ".936" selects the code page for Simplified Chinese (GBK)
    // However, you can't use ".1200" (the code page for UTF-16) to output text in Unicode
    setlocale(LC_CTYPE, ".936");
    // My name in Chinese: "范翔"
    // %ls always means a wide string, in both VC and standard C
    fwprintf(f, L"%ls", L"\x8303\x7FD4");

    // The text in the file is encoded in GBK
}

void OutputUnicode(FILE *f)
{
    // VC extension: switch the stream to UTF-16 text mode
    _setmode(_fileno(f), _O_U16TEXT);
    fwprintf(f, L"%ls", L"\x8303\x7FD4");

    // The text in the file is encoded in Unicode (UTF-16)
}
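To see what "encoded in Unicode" means for the bytes in the file, here is a small portable sketch. It assumes the stream writes each UTF-16 code unit in little-endian order, as Windows does, and it ignores any byte order mark. ToUtf16LeBytes is a hypothetical helper for illustration, not part of VC or the standard library.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (illustration only): lay out UTF-16 code units as
// little-endian bytes. Assumption: no byte order mark is added.
std::vector<unsigned char> ToUtf16LeBytes(const std::u16string &text)
{
    std::vector<unsigned char> bytes;
    bytes.reserve(text.size() * 2);
    for (char16_t unit : text) {
        bytes.push_back(static_cast<unsigned char>(unit & 0xFF));         // low byte first
        bytes.push_back(static_cast<unsigned char>((unit >> 8) & 0xFF));  // then high byte
    }
    return bytes;
}
```

For the two characters used above, ToUtf16LeBytes(u"\u8303\u7FD4") gives the four bytes 03 83 D4 7F. The code units are stored directly, with no dependency on the current locale's code page.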

For more information, see the following post: http://blogs.msdn.com/michkap/archive/2008/03/18/8306597.aspx

Published Tuesday, August 04, 2009 2:35 PM by xiangfan
