<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Is CharNextExA broken?</title><link>http://blogs.msdn.com/b/michkap/archive/2007/04/19/2190207.aspx</link><description>Jochen Kalmbach asks over in the Suggestion Box: 
 
 Hi Michael! 
 Short question: Is "CharNextExA" broken in XP (or generally broken)? 
 It does not recognize UTF8... Here is a small example: 
 #include &amp;lt;windows.h&amp;gt; #include &amp;lt;tchar.h&amp;gt;</description><dc:language>en-US</dc:language><generator>Telligent Evolution Platform Developer Build (Build: 5.6.50428.7875)</generator><item><title>What type to use for code page values</title><link>http://blogs.msdn.com/b/michkap/archive/2007/04/19/2190207.aspx#2322360</link><pubDate>Sun, 29 Apr 2007 10:21:41 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:2322360</guid><dc:creator>Sorting It All Out</dc:creator><description>&lt;p&gt;I was asked via the Contact link: Why does CharPrevExA/CharNextExA take a WORD for code page, whereas&lt;/p&gt;
</description></item><item><title>re: Is CharNextExA broken?</title><link>http://blogs.msdn.com/b/michkap/archive/2007/04/19/2190207.aspx#2205939</link><pubDate>Fri, 20 Apr 2007 14:13:25 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:2205939</guid><dc:creator>Michiel</dc:creator><description>&lt;p&gt;First rough solution:&lt;/p&gt;
&lt;p&gt;1. IsDBCSLeadByteEx is documented as returning non-zero if the byte is a lead byte. Make it return the number of trail bytes that follow the lead byte (by definition 1 for DBCS).&lt;/p&gt;
&lt;p&gt;2. Update its documentation to say &amp;quot;A lead byte is the first byte of a character sequence in a double-byte character set (DBCS) or multibyte character set (MBCS) for the code page.&amp;quot;&lt;/p&gt;
&lt;p&gt;3. CharNextExA uses the trail-byte count returned by IsDBCSLeadByteEx to advance.&lt;/p&gt;
&lt;p&gt;This approach takes advantage of the fact that UTF-8 was explicitly designed so that the number of bytes in a character sequence can be determined from its lead byte.&lt;/p&gt;
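&lt;p&gt;A minimal sketch of steps 1 and 3, for UTF-8 only. The names utf8_trail_count and utf8_next are hypothetical stand-ins, not Windows APIs, and the sketch assumes a NUL-terminated buffer. It does not reject overlong or out-of-range encodings.&lt;/p&gt;

```c
/* Hypothetical helper (not a Windows API): number of trail bytes implied
   by a UTF-8 lead byte, or -1 if the byte cannot start a sequence. */
int utf8_trail_count(unsigned char b)
{
    if ((b >> 7) == 0x00) return 0;   /* 0xxxxxxx: single-byte character */
    if ((b >> 5) == 0x06) return 1;   /* 110xxxxx: one trail byte */
    if ((b >> 4) == 0x0E) return 2;   /* 1110xxxx: two trail bytes */
    if ((b >> 3) == 0x1E) return 3;   /* 11110xxx: three trail bytes */
    return -1;                        /* 10xxxxxx (trail byte) or invalid */
}

/* Hypothetical CharNextExA-like step: advance past one UTF-8 sequence.
   Assumes p points into a NUL-terminated string. */
const char *utf8_next(const char *p)
{
    int trail, i;

    if (*p == '\0') return p;         /* stay on the terminator */

    trail = utf8_trail_count((unsigned char)*p);
    if (trail == -1) return p + 1;    /* malformed: skip a single byte */

    ++p;
    for (i = 0; i != trail; ++i) {
        /* Stop early at the terminator or any byte that is not 10xxxxxx. */
        if ((((unsigned char)*p) >> 6) != 0x02) break;
        ++p;
    }
    return p;
}
```

&lt;p&gt;Stopping at the first byte that is not a trail byte keeps a truncated sequence from walking past the terminator. Whether a real implementation should skip or reject malformed input is a separate design decision.&lt;/p&gt;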
&lt;p&gt;CharPrevExA is harder, because IsDBCSLeadByteEx has to return 0 both for single-byte characters and for non-lead bytes. You also can't decrement twice without risking a buffer underrun (the caller must obviously make sure there is one preceding character, but not necessarily two). Hence, only one byte is guaranteed to be available. If that byte is a single-byte character, we're done; otherwise we search backwards until we find the first lead byte. The remaining problem is determining which bytes are single-byte characters.&lt;/p&gt;</description></item></channel></rss>