<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://blogs.msdn.com/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>UCS-2 to UTF-16, Part 10: Variation[ Selector] on a theme...</title><link>http://blogs.msdn.com/b/michkap/archive/2009/06/10/9723321.aspx</link><description>Previous blogs in this series of blogs on this Blog: 

 
 Part 0: 
The intro, sans content 
 

 Part 1: 
Getting the obvious out of the way 
 

 Part 2: 
A&amp;amp;P of a 'linguistic character' 
 

 Part 3: 
It starts with cursor movement</description><dc:language>en-US</dc:language><generator>Telligent Evolution Platform Developer Build (Build: 5.6.50428.7875)</generator><item><title>Whither WM_UNICHAR in Windows 7 (and 8!)</title><link>http://blogs.msdn.com/b/michkap/archive/2009/06/10/9723321.aspx#10308203</link><pubDate>Mon, 21 May 2012 14:04:34 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10308203</guid><dc:creator>Sorting it all Out</dc:creator><description>&lt;p&gt;The WM_UNICHAR message has had an interesting history. &lt;/p&gt;
&lt;p&gt; Over time it has come up in this Blog occasionally&lt;/p&gt;
&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10308203" width="1" height="1"&gt;</description></item><item><title>Should considering UTF-16 be harmful be considered harmful?</title><link>http://blogs.msdn.com/b/michkap/archive/2009/06/10/9723321.aspx#10298464</link><pubDate>Fri, 27 Apr 2012 14:03:42 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:10298464</guid><dc:creator>Sorting it all Out</dc:creator><description>&lt;p&gt;Like many of the people I know, I find myself looking over at Stack Overflow and related sites periodically&lt;/p&gt;
&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=10298464" width="1" height="1"&gt;</description></item><item><title>re: UCS-2 to UTF-16, Part 10: Variation[ Selector] on a theme...</title><link>http://blogs.msdn.com/b/michkap/archive/2009/06/10/9723321.aspx#9800494</link><pubDate>Wed, 24 Jun 2009 02:45:45 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9800494</guid><dc:creator>Michael S. Kaplan</dc:creator><description>&lt;p&gt;Not all clipboard cases grab surrounding spaces and most do not if you do not select the space.&lt;/p&gt;
&lt;p&gt;The Bidi marks are another case, one I talk about tomorrow....&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9800494" width="1" height="1"&gt;</description></item><item><title>re: UCS-2 to UTF-16, Part 10: Variation[ Selector] on a theme...</title><link>http://blogs.msdn.com/b/michkap/archive/2009/06/10/9723321.aspx#9800274</link><pubDate>Wed, 24 Jun 2009 00:57:57 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9800274</guid><dc:creator>John Cowan</dc:creator><description>&lt;p&gt;Cut/paste isn't quite the same thing as selection: in some contexts, when we cut selected text, following whitespace also goes with it. &amp;nbsp;In that case, I'd probably extend the selection to include any following invisible character (what is currently done for bidi marks in that situation?)&lt;/p&gt;
&lt;p&gt;In addition to Andrew's reasons, another reason why VSes shouldn't be used for simplified characters is that N traditional characters often map to the same simplified character, so it would be impossible to tell by looking at a text what underlies it. &amp;nbsp;There are other situations where this is true, like:&lt;/p&gt;
&lt;p&gt;english = CIBARA&lt;/p&gt;
&lt;p&gt;(which word do you read first?), but that's no reason to add to it.&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9800274" width="1" height="1"&gt;</description></item><item><title>re: UCS-2 to UTF-16, Part 10: Variation[ Selector] on a theme...</title><link>http://blogs.msdn.com/b/michkap/archive/2009/06/10/9723321.aspx#9732179</link><pubDate>Fri, 12 Jun 2009 12:25:20 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9732179</guid><dc:creator>Andrew West</dc:creator><description>&lt;p&gt;&amp;quot;new simplified characters should be traditional characters with variation selectors&amp;quot;&lt;/p&gt;
&lt;p&gt;I know that there are certain people in the UTC who favour this approach, but I don't think it has any chance of being adopted, because there would be massive opposition from IRG and WG2. At this stage in the game it is simply too late to change the model for Han encoding, and all we can do is bite the bullet and encode all required simplified character forms as separate characters (about half of the CJK-D characters currently under ballot are simplified forms of existing traditional characters).&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9732179" width="1" height="1"&gt;</description></item><item><title>re: UCS-2 to UTF-16, Part 10: Variation[ Selector] on a theme...</title><link>http://blogs.msdn.com/b/michkap/archive/2009/06/10/9723321.aspx#9732113</link><pubDate>Fri, 12 Jun 2009 12:17:39 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9732113</guid><dc:creator>Andrew West</dc:creator><description>&lt;p&gt;I think I'm too close to variation selectors to be able to answer your questions objectively. I like to have full control over my text, including variation selectors, but I can see that many users would be annoyed if the appearance of their text kept getting screwed up because of some invisible entity that was not staying in its place.&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9732113" width="1" height="1"&gt;</description></item><item><title>re: UCS-2 to UTF-16, Part 10: Variation[ Selector] on a theme...</title><link>http://blogs.msdn.com/b/michkap/archive/2009/06/10/9723321.aspx#9726939</link><pubDate>Thu, 11 Jun 2009 19:20:39 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9726939</guid><dc:creator>Michael S. Kaplan</dc:creator><description>&lt;p&gt;As a by the way, this is one of the many reasons I was against the VS notion the entire time it has been alive. 
&lt;/p&gt;
&lt;p&gt;Does the answer to any of the above change with the argument (made by some, and which I also disagree with) that new simplified characters should be traditional characters with variation selectors?&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9726939" width="1" height="1"&gt;</description></item><item><title>re: UCS-2 to UTF-16, Part 10: Variation[ Selector] on a theme...</title><link>http://blogs.msdn.com/b/michkap/archive/2009/06/10/9723321.aspx#9726932</link><pubDate>Thu, 11 Jun 2009 19:18:24 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9726932</guid><dc:creator>Michael S. Kaplan</dc:creator><description>&lt;p&gt;Okay, differences in opinion are good, always. :-)&lt;/p&gt;
&lt;p&gt;But I am curious, especially of John and Andrew but really of everyone: in the myriad of cases, what is the best result? Imagine (in addition to the specific scenarios John and Andrew mentioned) the following:&lt;/p&gt;
&lt;p&gt;-- Delete a section, including an alternate glyph at the end &amp;nbsp;-- Should the VS be left in the old text, and does the answer change when the VS is alone at the beginning vs. attached to a new (possibly illegal or, potentially worse, legal but unintended choice of) base character?&lt;/p&gt;
&lt;p&gt;-- Copy/Paste -- All of the above, plus: -- should the alternate glyph be lost? &lt;/p&gt;
&lt;p&gt;You see what I am getting at -- what do you think should happen with the orphaned VS, which is now just in the text, unintended? I am worried about two principles, 1) what to do with the original intent of the text, and 2) what to do with the potentially illegal, potentially unintended result of the remaining text....&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9726932" width="1" height="1"&gt;</description></item><item><title>re: UCS-2 to UTF-16, Part 10: Variation[ Selector] on a theme...</title><link>http://blogs.msdn.com/b/michkap/archive/2009/06/10/9723321.aspx#9726625</link><pubDate>Thu, 11 Jun 2009 17:10:32 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9726625</guid><dc:creator>Michael S. Kaplan</dc:creator><description>&lt;p&gt;Okay, an opinion not brought up with Unicode (or at least not written in for its own recommendations...).&lt;/p&gt;
&lt;p&gt;But let's take the cut/paste case as an example -- you'd really want the VS to be left dangling alone with no base character, even though the user selected the &amp;quot;changed&amp;quot; character?&lt;/p&gt;
&lt;p&gt;Or cursor movement, when display works right. You really want the cursor weirdness that would manifest itself, with no other visual indication of what's going on?&lt;/p&gt;
&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9726625" width="1" height="1"&gt;</description></item><item><title>re: UCS-2 to UTF-16, Part 10: Variation[ Selector] on a theme...</title><link>http://blogs.msdn.com/b/michkap/archive/2009/06/10/9723321.aspx#9726505</link><pubDate>Thu, 11 Jun 2009 16:01:32 GMT</pubDate><guid isPermaLink="false">91d46819-8472-40ad-a661-2c78acb4018c:9726505</guid><dc:creator>Andrew West</dc:creator><description>&lt;p&gt;Actually, the result is more likely to look weird with the VS, and if stripped of the VS the result would generally look normal. But anyway, I agree with John that I would not consider the VS and its base character an inseparable entity like a surrogate pair, and I personally like to be able to treat VS's as characters when editing text -- deleting them and trying different VS's on different characters to get the desired result.&lt;/p&gt;&lt;div style="clear:both;"&gt;&lt;/div&gt;&lt;img src="http://blogs.msdn.com/aggbug.aspx?PostID=9726505" width="1" height="1"&gt;</description></item></channel></rss>