CorrecteurOrthographiqueOffice

We are a group of French-speaking computational linguists in Redmond, WA (USA) who develop proofing tools for Microsoft. We come from France, Belgium, and Canada. This blog is for you: send us your feedback and suggestions, in English or in French.

Transliteration Utility freely downloadable

[French version here]

Two colleagues from my group (Nick Cipollone and Andrea Jessee) recently developed a tool called Transliteration Utility, which converts text from one natural language script to another (for example, Serbian Latin to Serbian Cyrillic, or Latin characters to Inuktitut). The tool uses a simple but powerful rule language, and you can also use it to create, edit, debug, and test your own transliteration modules.

It can be used in four ways:

   1. Typing text in one script into a field, which the tool converts on the fly;

   2. Copying and pasting text into a field, which the tool converts automatically;

   3. Giving it a whole Unicode text file to convert;

   4. Converting a list of Unicode files through its command-line interface.

A key feature of the tool is its Module Development Console, which allows anyone to author, edit, and/or test new or existing transliteration modules.

Microsoft Transliteration Utility is freely available for public download at http://www.microsoft.com/globaldev/tools/translit.mspx.

It comes with nine modules ready for use (and you can create your own modules):

Bosnian Cyrillic to Latin

Bosnian Latin to Cyrillic 

Serbian Cyrillic to Latin

Serbian Latin to Cyrillic

Hangul to Romanization

Inuktitut to Romanization

Romanization to Inuktitut

Malayalam to Romanization

Romanization to Malayalam

 

This is really a cool tool or, to say it in Malayalam script, ഠിസ് ഇസ് രെഅല്ല്യ് ചോല്റ്റോല്‍, or in Cyrillic script: Тхис ис реаллy а цоол тоол!
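To give a feel for what rule-based transliteration involves, here is a minimal Python sketch of a longest-match converter from Serbian Latin to Serbian Cyrillic. It is only an illustration: the `RULES` table and the `transliterate` function are hypothetical names made up for this example, and the code does not use or reproduce the Transliteration Utility's own rule language or modules.

```python
# Hypothetical rule table: Serbian Latin -> Serbian Cyrillic (lowercase forms).
# Illustration only; this is NOT the Transliteration Utility's rule format.
RULES = {
    # Digraphs map to a single Cyrillic letter, so they need longer keys.
    "lj": "љ", "nj": "њ", "dž": "џ",
    "a": "а", "b": "б", "c": "ц", "č": "ч", "ć": "ћ", "d": "д", "đ": "ђ",
    "e": "е", "f": "ф", "g": "г", "h": "х", "i": "и", "j": "ј", "k": "к",
    "l": "л", "m": "м", "n": "н", "o": "о", "p": "п", "r": "р", "s": "с",
    "š": "ш", "t": "т", "u": "у", "v": "в", "z": "з", "ž": "ж",
}


def transliterate(text, rules=RULES):
    """Convert text by always applying the longest matching rule first."""
    longest = max(len(key) for key in rules)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so "lj" wins over "l" + "j".
        for size in range(longest, 0, -1):
            chunk = text[i:i + size]
            target = rules.get(chunk.lower())
            if target is not None:
                # Preserve an initial capital ("Lj" -> capital Cyrillic letter).
                out.append(target.upper() if chunk[0].isupper() else target)
                i += len(chunk)
                break
        else:
            # No rule applies (spaces, punctuation, letters such as "y"):
            # copy the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(transliterate("This is really a cool tool!"))
    print(transliterate("Ljubav"))
```

With this table, the English sentence comes out as the Cyrillic string quoted above: each letter is mapped by spelling, not by pronunciation, and the "y" stays Latin because the Serbian Latin alphabet has no rule for it.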

 

Thierry Fontenelle

Microsoft Speech & Natural Language

 

Published Tuesday, February 07, 2006 10:12 PM by OrthoFR

Comments

# re: Transliteration Utility freely downloadable @ Tuesday, February 21, 2006 12:29 PM

Very good tool indeed...

Leetia Janes

# Extending the MS Transliteration Utility @ Saturday, August 19, 2006 6:28 AM


Regular reader KJK:Hyperion asked in the Suggestion Box:

...when will Transliteration Utility support...

Sorting It All Out

# re: Transliteration Utility freely downloadable @ Sunday, August 20, 2006 7:48 AM

not a big deal or anything.. i just thought it was funny and that id point out.. that the output doesnt really represent what it was input from. cant speak for the cyrillic, but i imagine the same must be true for that too.

if you sound it out it actually sounds like "this is ray ahlly a chole tole"

the reason for this is that the module is following some standard such as ITRANS and all the letters have a standard mapping.. for example, c doesnt go to "ക" like in crow it goes to "ച" like in church.

the word "cool" to be properly transliterated should be input as "kuul"

anyways just thought it was funny...

speaking of the module though.. where is that? id sure like to tweak it to produce more natural transliterations in english script than having random capital letters in the middle of words as ITRANS produces.. such a module would probably be nothing like whats used here, but itd sure be nice to have an example to work from none the less...

what gives, why arent these modules included anywhere? are they embedded as resources in the assembly? actually.. hmm, maybe ill try there next...

dennispg

# Freely downloadable transliteration utility @ Wednesday, August 29, 2007 1:04 AM

[ English version here ] Two colleagues from my group ( Nick Cipollone and Andrea Jessee ) have just

CorrecteurOrthographiqueOffice
