Sorting it all Out Michael Kaplan's random stuff of dubious value Be sure to read the disclaimer here first!
James Brown asked in the microsoft.public.win32.programmer.international and microsoft.public.win32.programmer.gdi newsgroups:
Suppose I have the following two Arabic codepoints:
U+0648 "arabic letter waw"
U+0650 "arabic letter kasra"
These render as a single glyph with Uniscribe
When pasted into Notepad, the cursor (and selection highlight) can traverse into the middle of the cluster.
When pasted into Wordpad, the cursor _cannot_ move into the middle of these characters.
Which is the correct (or desirable) behaviour?
Maybe someone can even explain, what significance does it have for the cursor to move into the *middle* of a grapheme cluster - how does the user know which character he/she has selected??
thanks,
James
Excellent question, James!
The desirable behavior is what you are describing as the WordPad behavior, to a point. Although if I paste a string of 12 of these pairs of characters (وِوِوِوِوِوِوِوِوِوِوِوِ) into WordPad, it will treat them as a single unit, which is not what I would call desirable. :-(
The Notepad behavior you describe is also not preferred; in all cases other than the BACKSPACE key (for the reasons I describe here), you would want cursor movement to jump from one text element boundary to the next, and here the two characters you mentioned together form a single text element....
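To make the idea of "jumping the text element boundaries" a bit more concrete, here is a rough Python sketch. It is not how Uniscribe, Notepad, or WordPad actually work; it just assumes that a text element is a base character plus any combining marks that follow it (a simplification of the real Unicode grapheme cluster rules), and that BACKSPACE removes a single code point. The function names `caret_stops`, `move_right`, and `backspace` are made up for this illustration.

```python
import unicodedata


def caret_stops(text):
    """Return the indices where a caret may rest.

    Simplifying assumption: a caret may never sit directly before a
    combining mark (general category Mn, Mc, or Me).
    """
    stops = [0]
    for i in range(1, len(text)):
        if not unicodedata.category(text[i]).startswith("M"):
            stops.append(i)
    if text:
        stops.append(len(text))
    return stops


def move_right(text, pos):
    """Move the caret to the next allowed stop after pos."""
    for stop in caret_stops(text):
        if stop > pos:
            return stop
    return len(text)


def backspace(text, pos):
    """Assumed exception: remove only the one code point before pos."""
    if pos == 0:
        return text, 0
    return text[:pos - 1] + text[pos:], pos - 1


pair = "\u0648\u0650"  # ARABIC LETTER WAW + ARABIC KASRA

print(caret_stops(pair))            # [0, 2]
print(move_right(pair, 0))          # 2
print(len(caret_stops(pair * 12)))  # 13
print(backspace(pair, 2) == ("\u0648", 1))  # True
```

Under these assumptions the caret never lands at index 1 (the Notepad behavior James saw), and a run of 12 pairs still has a stop after every pair rather than being one giant unit (the WordPad behavior I complained about). BACKSPACE is the one operation that reaches inside the element and takes off just the kasra.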
The bad news is that I can reproduce the Notepad behavior you describe in Windows Server 2003 SP1.
The worse news is that I can reproduce the WordPad behavior I describe above in Windows Server 2003 SP1 and XP SP2.
But the good news is that in XP SP2, Notepad behaves correctly and the cursor does not appear in the middle of the character....
In IE6, I currently get the character splitting behavior. You can test out your own browser and version with the textbox below -- put the cursor in and move back and forth to see what happens:
At least products are getting better though (the Vista version of Uniscribe has all of the XP SP2 updates and more!).
This post brought to you by "و" (U+0648, a.k.a. ARABIC LETTER WAW)