<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Getting the characters in a code page</title><link>http://blogs.msdn.com/michkap/archive/2006/01/07/510411.aspx</link><description>In the Suggestion Box, rob asked the following question: Michael, As posted over at Raymond Chen's blog. What is the best way to display all the characters in i.e. codepage 932 (Japanese) and other codepage that is supported on Windows (post win2k era).</description><dc:language>en-US</dc:language><generator>CommunityServer 2.1 SP1 (Build: 61025.2)</generator><item><title>re: Getting the characters in a code page</title><link>http://blogs.msdn.com/michkap/archive/2006/01/07/510411.aspx#510533</link><pubDate>Sun, 08 Jan 2006 10:52:29 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:510533</guid><dc:creator>Mike Dunn</dc:creator><description>As a variation on #1, how about passing each of the possible lead+trail byte combos to _ismbclegal?</description></item><item><title>re: Getting the characters in a code page</title><link>http://blogs.msdn.com/michkap/archive/2006/01/07/510411.aspx#510541</link><pubDate>Sun, 08 Jan 2006 12:08:51 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:510541</guid><dc:creator>Mihai</dc:creator><description>The only problem (with all of the solutions) is that not all code pages can be covered like this. The main exception is GB-18030, which has characters outside BMP.&lt;br&gt;&lt;br&gt;Sure, one might extend the range beyond BMP, but the performance goes down and the consumed memory goes up.
&lt;br&gt;In this case a variant of #1 might give better results (with the warning that the complications and care will be even bigger :-)&lt;br&gt;</description></item><item><title>re: Getting the characters in a code page</title><link>http://blogs.msdn.com/michkap/archive/2006/01/07/510411.aspx#510560</link><pubDate>Sun, 08 Jan 2006 17:53:33 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:510560</guid><dc:creator>Michael S. Kaplan</dc:creator><description>For both GB18030 and UTF-8, *all* of Unicode is covered, so if it is a character, it's in the code page. And that's that. Easy!&lt;br&gt;&lt;br&gt;#1 is never easier, though -- it is always more work....</description></item><item><title>re: Getting the characters in a code page</title><link>http://blogs.msdn.com/michkap/archive/2006/01/07/510411.aspx#510561</link><pubDate>Sun, 08 Jan 2006 17:54:39 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:510561</guid><dc:creator>Michael S. Kaplan</dc:creator><description>Mike, that is still more work by the time you are done, certainly more complicated to implement....</description></item><item><title>re: Getting the characters in a code page</title><link>http://blogs.msdn.com/michkap/archive/2006/01/07/510411.aspx#510606</link><pubDate>Sun, 08 Jan 2006 23:24:43 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:510606</guid><dc:creator>Mihai</dc:creator><description>&amp;quot;For both GB18030 and UTF-8, *all* of Unicode is covered&amp;quot;&lt;br&gt;True. But the proposed solutions #2 and #3 take BMP only.&lt;br&gt;&lt;br&gt;&amp;quot;#1 is never easier, though -- it is always more work....&amp;quot;&lt;br&gt;True again :-)&lt;br&gt;But I was talking about better results in performance and consumed memory.
&lt;br&gt;Sometimes you have to work more for these two.</description></item><item><title>re: Getting the characters in a code page</title><link>http://blogs.msdn.com/michkap/archive/2006/01/07/510411.aspx#510610</link><pubDate>Sun, 08 Jan 2006 23:52:43 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:510610</guid><dc:creator>Michael S. Kaplan</dc:creator><description>Ah, you missed my point -- you do not need to do anything for UTF-8 and  GB18030 -- no conversion needed at all!&lt;br&gt;&lt;br&gt;It is all in there.</description></item><item><title>re: Getting the characters in a code page</title><link>http://blogs.msdn.com/michkap/archive/2006/01/07/510411.aspx#510724</link><pubDate>Mon, 09 Jan 2006 11:11:27 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:510724</guid><dc:creator>Mihai</dc:creator><description>&amp;quot;Ah, you missed my point&amp;quot;&lt;br&gt;True :-) But now I get it.&lt;br&gt;&lt;br&gt;But do you mean GB18030 contains everything that is in Unicode? I thought it is a (big) subset. So it is a bit like 932, only much bigger, but still not covering all of Unicode. Or is it?</description></item><item><title>re: Getting the characters in a code page</title><link>http://blogs.msdn.com/michkap/archive/2006/01/07/510411.aspx#510752</link><pubDate>Mon, 09 Jan 2006 14:56:02 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:510752</guid><dc:creator>Michael S. Kaplan</dc:creator><description>GB18030 is completely tied to Unicode as it is defined, and thus everything in Unicode is in GB18030.</description></item><item><title>re: Getting the characters in a code page</title><link>http://blogs.msdn.com/michkap/archive/2006/01/07/510411.aspx#510924</link><pubDate>Mon, 09 Jan 2006 23:14:57 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:510924</guid><dc:creator>Rob</dc:creator><description>Michael,&lt;br&gt;&lt;br&gt;Thanks for the explanation.  
If I use recommendation #3, how would I distinguish what characters belong to what code page?  Would I use the hex range to determine the code page?&lt;br&gt;&lt;br&gt;Thanks&lt;br&gt;Rob</description></item><item><title>re: Getting the characters in a code page</title><link>http://blogs.msdn.com/michkap/archive/2006/01/07/510411.aspx#510936</link><pubDate>Mon, 09 Jan 2006 23:40:13 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:510936</guid><dc:creator>Michael S. Kaplan</dc:creator><description>Hi Rob,&lt;br&gt;&lt;br&gt;For #3 -- check the return value of the WideCharToMultiByte call -- if the conversion succeeds, it is a valid part of the code page you used to convert, and the byte(s) are in the multibyte param. If it fails, then skip to the next character....</description></item><item><title>supplementary?</title><link>http://blogs.msdn.com/michkap/archive/2006/01/07/510411.aspx#510963</link><pubDate>Tue, 10 Jan 2006 00:29:25 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:510963</guid><dc:creator>Maurits</dc:creator><description>This works... until a code page comes out with a supplementary code point in it...</description></item><item><title>re: Getting the characters in a code page</title><link>http://blogs.msdn.com/michkap/archive/2006/01/07/510411.aspx#510964</link><pubDate>Tue, 10 Jan 2006 00:33:54 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:510964</guid><dc:creator>Michael S. Kaplan</dc:creator><description>There are no new code pages coming in Windows, sorry! :-)</description></item><item><title>re: Getting the characters in a code page</title><link>http://blogs.msdn.com/michkap/archive/2006/01/07/510411.aspx#511257</link><pubDate>Tue, 10 Jan 2006 20:32:27 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:511257</guid><dc:creator>Rob</dc:creator><description>Michael,&lt;br&gt;&lt;br&gt;Thanks for the info.  
I'll try #3 and store all the valid code points in a vector of CString, then print them to a file and open it in IE.  Viewing in IE will allow me to see all the valid characters.&lt;br&gt;&lt;br&gt;Rob</description></item><item><title>re: Getting the characters in a code page</title><link>http://blogs.msdn.com/michkap/archive/2006/01/07/510411.aspx#514997</link><pubDate>Thu, 19 Jan 2006 22:56:15 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:514997</guid><dc:creator>CK </dc:creator><description>Hi Michael,&lt;br&gt;&lt;br&gt;Any code sample(s) for recommendation #3?  Thanks in advance if you have any.&lt;br&gt;&lt;br&gt;Thanks&lt;br&gt;CK</description></item></channel></rss>