Behold the Table Driven Text Service, Part 11 (The knights who say நீ, redux, #1)

Sorting it all Out
Michael Kaplan's random stuff of dubious value



Okay, we have now gone through a bunch of information on the Table Driven Text Service component and the text files that define the identity and behavior of individual Text Profiles.

So what happens next, exactly?

Well, let's start with the Text Profile I discussed, demo'd, and did not yet give to anyone in "And we are the knights who say நீ (NII)".

The framework I am using is the same as in that post, plus the feedback in the comments. Each consonant gets entries like the following:

"n"   = "ந்"
"na"  = "ந"
"naa" = "நா"
"ni"  = "நி"
"nii" = "நீ"
"nu"  = "நு"
"nuu" = "நூ"
"ne"  = "நெ"
"nee" = "நே"
"nai" = "நை"
"no"  = "நொ"
"noo" = "நோ"
"nau" = "நௌ"

And then the following pure (independent) vowels:

a    அ
aa   ஆ
i    இ
ii    ஈ
u    உ
uu   ஊ
e    எ
ee   ஏ
ai    ஐ
o    ஒ
oo   ஓ
au   ஔ

And the following consonants:

k    க்
ng   ங்
c    ச்
j    ஜ்
ny   ஞ்
tt    ட்
d    ட்
nnn  ண்
th    த்
n    ந்
nn  ன்
p    ப்
f    ஃப்
m   ம்
y   ய்
r    ர்
rr   ற்
l    ல்
L    ள்
zh   ழ்
v    வ்
ss   ஷ்
s    ஸ்
sh   ஶ்
h    ஹ்
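
Since every consonant follows the same pattern as the "n" rows above, the full set of entries can be generated rather than typed by hand. What follows is only a minimal sketch under stated assumptions: the helper name rows_for_consonant and the key = value output layout are made up for illustration and are not the actual text file format the Table Driven Text Service reads. The idea is to drop the pulli (virama) from the bare consonant and append the matching dependent vowel sign.

```python
# Hypothetical generator for the consonant + vowel rows shown above.
# The names and the printed layout are illustrative assumptions only.

VIRAMA = "\u0bcd"  # TAMIL SIGN VIRAMA (pulli), marks a bare consonant

# Romanized vowel -> dependent vowel sign ("" for the inherent "a").
VOWEL_SIGNS = {
    "a": "",
    "aa": "\u0bbe",
    "i": "\u0bbf",
    "ii": "\u0bc0",
    "u": "\u0bc1",
    "uu": "\u0bc2",
    "e": "\u0bc6",
    "ee": "\u0bc7",
    "ai": "\u0bc8",
    "o": "\u0bca",
    "oo": "\u0bcb",
    "au": "\u0bcc",
}


def rows_for_consonant(roman, bare):
    """Return {typed key: Tamil output} for one consonant.

    `roman` is the left-side key (e.g. "n") and `bare` is the consonant
    with its pulli (e.g. U+0BA8 U+0BCD).
    """
    # Strip the trailing pulli to get the consonant with its inherent vowel.
    base = bare[:-1] if bare.endswith(VIRAMA) else bare
    rows = {roman: bare}
    for vowel, sign in VOWEL_SIGNS.items():
        rows[roman + vowel] = base + sign
    return rows


n_rows = rows_for_consonant("n", "\u0ba8\u0bcd")
for key, value in n_rows.items():
    print(f'"{key}" = "{value}"')
```

Running the same function over each row of the consonant list would produce the whole table, so changing a left-side key (say, "ll" to "L") only has to happen in one place.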

But as others have pointed out, this is kind of tedious -- there are so many combinations that really should be handled by using different cases rather than requiring a person to type two vowels.

Now this is currently a limitation in TableTextService.DLL, but it may not always be -- some future version may address the limitation.

In fact, if you look at the Amharic input method in Vista and its text file, you'll see that it mixes upper and lower case on the input side, in anticipation of that limitation being addressed at some point. In the meantime, when you have multiple entries with the same letter differing only by case, they will simply both show up in the candidate list.
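
To make that behavior concrete, here is a small model of it. This is an assumption-laden sketch of the effect described above, not how TableTextService.DLL is actually implemented: the functions build_index and candidates are hypothetical, and the model simply folds case on lookup so that entries differing only by case land in the same candidate list.

```python
from collections import defaultdict

# Hypothetical model of a case-insensitive table lookup.
# Entries are (typed key, Tamil output) pairs, as in the lists above.
ENTRIES = [
    ("l", "\u0bb2\u0bcd"),  # TAMIL LETTER LA + pulli
    ("L", "\u0bb3\u0bcd"),  # TAMIL LETTER LLA + pulli
]


def build_index(entries):
    """Group outputs under the lowercased key, keeping table order."""
    index = defaultdict(list)
    for key, output in entries:
        index[key.lower()].append(output)
    return index


def candidates(index, typed):
    """Return every output whose key matches `typed`, ignoring case."""
    return index.get(typed.lower(), [])


index = build_index(ENTRIES)
print(candidates(index, "l"))  # both entries are offered
print(candidates(index, "L"))  # same list, since case is ignored
```

Under this model the user still gets to the right letter, but has to pick it from the list instead of getting it directly from the Shift key.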

So what is the principle here, native Tamil speakers? Taking the above lists, which of the left side entries would you change, and how?

When I get back all of the rest of the feedback, we'll replace my "based on Unicode character names" input method with one that will perhaps be a bit closer to intuitive!
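
For reference, here is one way a "based on Unicode character names" key could be derived. It is only a sketch of the general idea, assuming the key is the lowercased tail of the character name; the helper key_from_name is hypothetical and the actual lists above were not necessarily produced this way.

```python
import unicodedata


def key_from_name(ch):
    """Derive a romanized key from a Tamil character's Unicode name.

    For example, TAMIL LETTER NA gives "na" and
    TAMIL VOWEL SIGN II gives "ii".
    """
    name = unicodedata.name(ch)
    for prefix in ("TAMIL LETTER ", "TAMIL VOWEL SIGN "):
        if name.startswith(prefix):
            return name[len(prefix):].lower()
    raise ValueError(f"not a Tamil letter or vowel sign: {name}")


# The two code points behind the NII syllable: U+0BA8 and U+0BC0.
for ch in "\u0ba8\u0bc0":
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} -> {key_from_name(ch)}")
```

Names like LLA and LLLA show why this approach gets tedious for typing: the keys are unambiguous but not very intuitive.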

Then we'll configure the various settings and produce our ideal Tamil input method....

So, any native speakers want to chime in on their replacements? I have tried to do the ones that others suggested in the comments of that post, but I'd like to get them all done -- concentrate on the second and third lists above, noting how

l    ல்
ll    ள்

became

l    ல்
L    ள்

and going from there....

Now of course this is kind of a transliteration keyboard, but it does not have to be if there are keyboards that have just Tamil letters printed on them and we wanted this input method to match them. Does anyone have such a keyboard? And if so, could they take a picture of it?

 

This post brought to you by நீ (U+0ba8 U+0bc0, a.k.a. TAMIL LETTER NA + TAMIL VOWEL SIGN II)

Blog - Comment List
  • Well, Tamil actually is a lot more strict in its consonant clusters - only a small subset of the 18*18 combinations are actually allowed. For instance, when I type in nka, I would like ங்க, since before க, none of the other nasals are allowed.

    Of course, this would be domain-specific and it would limit the IME to generating only legal code-point runs, not everything allowable by the script, but I'd argue an IME needs to know the language well anyway.

  • That can be solved by making those letters the actual output for those keystrokes --- though of course we'd need them to be given.

    Would you like to try defining a few of those? :-)

    All I really need is:

    1. The corrected letters for the input, and
    2. Exceptions like the one you mention

    And then everything else will be done and ready to download/run on Vista and Server 2008....

  • So while I was in India, I picked up a bunch of books (my suitcase was probably 30 pounds heavier!).

  • Sure, I'd love to help. It's essentially a context-sensitive grammar. It's probably easier to lay out the correct options (exceptions as you term them) than to lay out all the options and rule out the illegal ones.

  • That would be wonderful. :-)

    Of course this is purely table based, so none of the more complex stuff (at the start of a word vs. not at the start of a word, etc.) can be captured...
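
A sketch of how the "nka" case from this exchange could be handled with nothing but table entries: list the legal cluster as its own key and let the longest key win. The table contents, the convert function, and the longest-match rule are all assumptions made for illustration; this is not a description of how the Table Driven Text Service consumes its tables.

```python
# Hypothetical table: the cluster entry sits alongside the plain entries.
TABLE = {
    "nka": "\u0b99\u0bcd\u0b95",  # NGA + pulli + KA, the cluster from the comment
    "n": "\u0ba8\u0bcd",          # NA + pulli
    "k": "\u0b95\u0bcd",          # KA + pulli
    "ka": "\u0b95",               # KA with inherent vowel
}


def convert(typed, table=TABLE):
    """Greedy longest-match conversion of a typed string."""
    longest = max(len(key) for key in table)
    out = []
    i = 0
    while i < len(typed):
        # Try the longest possible key first, then shorter ones.
        for size in range(min(longest, len(typed) - i), 0, -1):
            chunk = typed[i:i + size]
            if chunk in table:
                out.append(table[chunk])
                i += size
                break
        else:
            # No entry matches: pass the character through unchanged.
            out.append(typed[i])
            i += 1
    return "".join(out)


print(convert("nka"))  # uses the explicit cluster entry

# Without the cluster entry, the same input falls back to "n" + "ka".
plain = {key: value for key, value in TABLE.items() if key != "nka"}
print(convert("nka", plain))
```

Because each legal cluster is just another row, this stays purely table based; anything that depends on position within a word is still out of reach.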

  • Amazing! I don't know Tamil. I want to learn Tamil. I have been in Chennai for the past 3 years but never tried to learn Tamil. From couple I am thinking to learn Tamil. So today I browsed Google. I have gone through many sites. I did get any hope from those websites. I don't know how to speak Tamil, and I don't know how to write it. Never I thought I could get inspired by something where in I will easy. I liked the way you have represented the characters. I find it very beautifully arranged. I salute your brain.

    Thanks.

  • I've been blogging about Sinhalese keyboard on and off for some time. Like in November of 2005 when in

  • Prior posts in the series: 0 (You have to start somewhere) 1 (Starting with a dictionary simple in every
