<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>Every character has a story #19: U+200c and U+200d (ZERO WIDTH [NON] JOINER)</title><link>http://blogs.msdn.com/b/michkap/archive/2006/02/15/532394.aspx</link><description>In the world of Unicode, it is a small irony that what usually causes the most emails to be exchanged and the most documents to be written are the characters that have no actual visible representation. 
 Whether it is U+feff (a.k.a. ZERO WIDTH NO BREAK</description><dc:language>en-US</dc:language><generator>Telligent Evolution Platform Developer Build (Build: 5.6.50428.7875)</generator><item><title>The subtle difference between ශ්රී ලංකාව and ශ්‍රීලංකාව</title><link>http://blogs.msdn.com/b/michkap/archive/2006/02/15/532394.aspx#5451718</link><pubDate>Sun, 14 Oct 2007 18:59:37 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:5451718</guid><dc:creator>Sorting It All Out</dc:creator><description>&lt;p&gt;(The two words might look alike if you don't have the latest and greatest set to render on your machine!)&lt;/p&gt;
&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=5451718" width="1" height="1"&gt;</description></item><item><title>Why don't all the half forms sort right?</title><link>http://blogs.msdn.com/b/michkap/archive/2006/02/15/532394.aspx#770197</link><pubDate>Mon, 25 Sep 2006 11:47:15 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:770197</guid><dc:creator>Sorting It All Out</dc:creator><description>&lt;br&gt;George asked via the Contacting Me... link: &lt;br&gt;&lt;br&gt;I tried to use the Unicode method of creating half...&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=770197" width="1" height="1"&gt;</description></item><item><title>re: Every character has a story #19: U+200c and U+200d (ZERO WIDTH [NON] JOINER)</title><link>http://blogs.msdn.com/b/michkap/archive/2006/02/15/532394.aspx#593459</link><pubDate>Tue, 09 May 2006 11:11:12 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:593459</guid><dc:creator>UdaraG</dc:creator><description>Thanks very much, Michael!&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=593459" width="1" height="1"&gt;</description></item><item><title>re: Every character has a story #19: U+200c and U+200d (ZERO WIDTH [NON] JOINER)</title><link>http://blogs.msdn.com/b/michkap/archive/2006/02/15/532394.aspx#592374</link><pubDate>Mon, 08 May 2006 17:20:41 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:592374</guid><dc:creator>Michael S. Kaplan</dc:creator><description>Well, the shaping engine support will not yet be there in the CE Uniscribe, so there really is no way to make it all happen just yet. 
&lt;BR&gt;&lt;BR&gt;I do not know the timeline for updates on the mobile platform, but if I find out, I'll post about it....&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=592374" width="1" height="1"&gt;</description></item><item><title>re: Every character has a story #19: U+200c and U+200d (ZERO WIDTH [NON] JOINER)</title><link>http://blogs.msdn.com/b/michkap/archive/2006/02/15/532394.aspx#592281</link><pubDate>Mon, 08 May 2006 14:42:45 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:592281</guid><dc:creator>UdaraG</dc:creator><description>Hi Michael;&lt;br&gt;Thanks for the prompt reply, and extremely sorry for the much-delayed follow-up!&lt;br&gt;&lt;br&gt;Yes, I do understand the licensing concerns very well - myself being a software developer... - the only reason I modified the Tahoma font was that I had a small problem getting the font linking to work. :-)&lt;br&gt;&lt;br&gt;Now everything is perfect, except for the ligatures - of which the Sinhala language has a number. I have tried so many devices and WinCE OS versions, including the very latest, but to no avail!&lt;br&gt;&lt;br&gt;My conclusion is what you guessed right at the start - &amp;quot;updates for Sinhala that were added so recently are NOT in the CE version of Uniscribe&amp;quot;.&lt;br&gt;&lt;br&gt;1. Is there a way to ascertain this from anybody at MS, so that I need not worry about this any further?&lt;br&gt;&lt;br&gt;2. From the ref-link you provided above, it looks like I have no option short of moulding out an OS image (with WinCE Platform Builder) with Sinhala language support.&lt;br&gt;What may be the next steps in building support for a custom language (Sinhala) with WinCE PB?&lt;br&gt;Assuming I work full-time on it, maybe even burning some midnight oil :-), what does the effort/time estimate look like? 
I do have extensive C/C++ experience, and have already downloaded the evaluation copy of WinCE PB (my employer is hoping to buy it soon).&lt;br&gt;However, this is my first stint with WinCE PB, and I am expecting a considerable learning curve.&lt;br&gt;&lt;br&gt;Thanks and regards,&lt;br&gt;Udara&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=592281" width="1" height="1"&gt;</description></item><item><title>re: Every character has a story #19: U+200c and U+200d (ZERO WIDTH [NON] JOINER)</title><link>http://blogs.msdn.com/b/michkap/archive/2006/02/15/532394.aspx#580583</link><pubDate>Fri, 21 Apr 2006 16:05:23 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:580583</guid><dc:creator>Michael S. Kaplan</dc:creator><description>Hi Udara,&lt;br&gt;&lt;br&gt;I see a few problems with what you have mentioned here:&lt;br&gt;&lt;br&gt;&amp;quot;I imported the Sinhala character range from a Unicode-compatible font (called &amp;quot;Malithi Web&amp;quot;) into the standard MS Tahoma font&amp;quot;&lt;br&gt;&lt;br&gt;Of course the licensing concern is a very real one -- I do not know whether it is legal to modify Tahoma in such a way.&lt;br&gt;&lt;br&gt;Beyond that, there is the problem of making sure all of the OpenType tables are moved over properly. It is not enough to be 'Unicode-compatible', as Arial in Win95 is that; what is important is the right OT table info.&lt;br&gt;&lt;br&gt;And beyond that, see &lt;a rel="nofollow" target="_new" href="http://blogs.msdn.com/michkap/archive/2005/05/19/420145.aspx"&gt;http://blogs.msdn.com/michkap/archive/2005/05/19/420145.aspx&lt;/a&gt; which points out that Uniscribe is not present in every WinCE install, even if it is of the latest version.&lt;br&gt;&lt;br&gt;Of course beyond all that would be the question of whether the updates for Sinhala that were added so recently are in the CE version of Uniscribe. 
They may not have the XPSP2/ELK/Vista updates just yet....&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=580583" width="1" height="1"&gt;</description></item><item><title>re: Every character has a story #19: U+200c and U+200d (ZERO WIDTH [NON] JOINER)</title><link>http://blogs.msdn.com/b/michkap/archive/2006/02/15/532394.aspx#580458</link><pubDate>Fri, 21 Apr 2006 11:24:03 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:580458</guid><dc:creator>UdaraG</dc:creator><description>Hi Michael;&lt;br&gt;&lt;br&gt;Firstly, I am a Sri Lankan who uses Sinhala as my mother tongue, and was quite fascinated to see your use of the same in the example above!&lt;br&gt;&lt;br&gt;Secondly, I am at the early stages of developing a Sinhala IME for the PocketPC, and am having problems in sending the ZWJ along with leading/trailing characters so that it could be correctly interpreted and rendered on, say, Pocket Word.&lt;br&gt;&lt;br&gt;1. I imported the Sinhala character range from a Unicode-compatible font (called &amp;quot;Malithi Web&amp;quot;) into the standard MS Tahoma font, which I have copied over to the PPC.&lt;br&gt;2. With a rudimentary IME developed in eVC 4.0, I did an IMCallback.SendString() for the Unicode string {0x0DC1, 0x0DCA, 0x200D, 0x0DBB, 0x0DD3, NULL}.&lt;br&gt;&lt;br&gt;Though this should give me (as I understood) the first form in your example above, it actually gives me the second non-ligated form with a vertical bar (which I presume to be the ZWJ) drawn in between the two characters!&lt;br&gt;&lt;br&gt;What am I doing wrong?&lt;br&gt;Should I use IMCallback.SendCharEvents() instead?&lt;br&gt;Can you please point me to some code samples?
&lt;br&gt;&lt;br&gt;Thanks and regards,&lt;br&gt;Udara&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=580458" width="1" height="1"&gt;</description></item><item><title>What do you get when you combine a base character with a buttload of diacritics?</title><link>http://blogs.msdn.com/b/michkap/archive/2006/02/15/532394.aspx#534075</link><pubDate>Fri, 17 Feb 2006 16:51:53 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:534075</guid><dc:creator>Sorting It All Out</dc:creator><description>The other day I was looking at a particular bug repro (it was actually that BACKSPACE vs. DELETE bug...&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=534075" width="1" height="1"&gt;</description></item><item><title>re: Every character has a story #19: U+200c and U+200d (ZERO WIDTH [NON] JOINER)</title><link>http://blogs.msdn.com/b/michkap/archive/2006/02/15/532394.aspx#533229</link><pubDate>Thu, 16 Feb 2006 17:17:27 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:533229</guid><dc:creator>Michael S. Kaplan</dc:creator><description>Well, take a look at &lt;a rel="nofollow" target="_new" href="http://blogs.msdn.com/michkap/archive/2006/02/16/533226.aspx"&gt;http://blogs.msdn.com/michkap/archive/2006/02/16/533226.aspx&lt;/a&gt; which talks about this issue a bit -- definitely more complicated than disallowing a few characters! 
:-)&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=533229" width="1" height="1"&gt;</description></item><item><title>re: Every character has a story #19: U+200c and U+200d (ZERO WIDTH [NON] JOINER)</title><link>http://blogs.msdn.com/b/michkap/archive/2006/02/15/532394.aspx#533011</link><pubDate>Thu, 16 Feb 2006 08:56:17 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:533011</guid><dc:creator>Yosuke HASEGAWA</dc:creator><description>By using ZERO WIDTH (NON) JOINER or ZWNBSP (BOM) in a filename or in registry keys and values, you can create several files whose names appear identical.&lt;br&gt;&lt;br&gt;This may cause visual problems in the security domain. So I hope that Windows will disallow Unicode control characters in filenames.&lt;br&gt;Of course, the same goes for Bidi control characters such as &amp;quot;RLO&amp;quot;.&lt;br&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=533011" width="1" height="1"&gt;</description></item></channel></rss>