
Updating the Haitian Creole Translation system

Most of you know that we released the first publicly available Haitian Creole statistical machine translation engine last week and have been hard at work making it even better. I am pleased to announce that since last night we have rolled out two updates to the system and our site, which bring several improvements:

1) More training data = better translations. We trained the system on even more data (including data that we hand translated), which should be reflected in better translations. We are nowhere near done yet, and we will continue to work on this.

2) Updating the AJAX API and widget. The Translator widget (and the underlying AJAX API) now accurately reflect “Haitian Creole” as the language selected in their UI. This was primarily a user interface fix (the Haitian Creole translation itself worked fine). You can use the widget to deliver any webpage in any of the languages we support (including Haitian Creole).

3) Please don’t forget the broad set of APIs and webmaster resources that are available for those that are building applications and websites to help with the relief efforts. There are several efforts underway to develop mobile apps (using the SOAP or HTTP API) and websites (using the AJAX API). If you are working on something along those lines, leave a link to your app/site in the comments and I will make sure to surface them up here so people can find them more easily.

We will continue to work on improving the system and we wish to thank everyone in the community that has been instrumental in helping us get this much requested translation engine out of the door. Stay tuned for more announcements!

Also, let me once again point to a resource where you can help with the broader Haiti relief efforts. Please help in any way you can!

Update (1/31): The DIPLOMAT project at CMU in the 1990s was an earlier project to create a Haitian Creole system for DOD/DARPA. As I mentioned in our earlier blog post, our system makes use of CMU’s data from that project.

- Vikram Dendi, Senior Product Manager, Microsoft Translator

Posted by MSR-MT Team | 5 Comments

Announcement: Haitian Creole support in Bing Translator and other Microsoft Translator powered services

In the current crisis in Haiti there are a number of initiatives to rapidly build software to assist in humanitarian aid. Responding to community requests for a machine translation (MT) system to translate between English and Haitian Creole, our team has been hard at work over the last few days. I am glad to announce that an experimental Haitian Creole MT system is now publicly available via several services and APIs powered by Microsoft Translator technologies. We will continue working on improving the system, but we hope meanwhile that in spite of the experimental nature – it will be of use in the relief efforts.


1) What is being announced today?

Responding to requests from the community involved in Haitian relief efforts, Microsoft Research is making available today an experimental machine translation system for translating to and from Haitian Creole. You can try it at http://translate.bing.com or http://www.microsofttranslator.com

2) How is it significant?

With the devastating disaster that struck Haiti, we have all been individually pitching in to help the efforts. This is our effort, as a team, to respond to the needs of communities such as Crisis Commons by delivering a Haitian Creole translator which can be of help to individual users, as well as other technology projects that could use a scalable translation system in their relief endeavors. Further, the usage of our API is completely free and it can be built into any application or website for immediate use. We hope that this might help the many applications being developed (such as those on crisiscommons.org) to aid the humanitarian efforts.

3) How can I use this system?

The Haitian Creole translator is now part of the Microsoft Translator web service, enabling many of the user scenarios powered by the service. Users can access the service through the Microsoft Translator web site. Developers can look at our APIs and choose between SOAP and HTTP (support for Haitian Creole in our AJAX API will be rolled out in the coming days).

4) How is it different from other efforts?

There have been some great efforts in quickly building dictionary and rule-based Haitian Creole translation tools. The statistical machine translation system behind Microsoft Translator allows for a continuous improvement in the quality of translations (by adding more training data). Also, by delivering this as part of our web service we can ensure scale and performance and open up the possibility of using our many scenarios (Bing Translator, Internet Explorer 8, Messenger Bot etc.) with Haitian Creole, as well as using our extensive API set to add such support to other software and web sites at no cost.

5) What was involved in getting this out of the door in record time?

The process involved identifying parallel (translated) data between English and Haitian Creole, and training the MT engine to create the requisite language models. We would also like to acknowledge the great work being done by the Crisis Commons folks, the dictionary builders at haitisurf.com, the folks at CMU who made parallel data available, and the Microsoft volunteers who challenged our team to action.

6) What should I expect in terms of quality?

This is an experimental system put together in record time. While our typical approach to adding new languages involves significantly larger amounts of training and a higher threshold for quality testing, we decided that the upside warranted making the system available to the community as early as possible and continuing to improve it afterwards. We are working diligently to keep improving the quality, but bear with us if you encounter problems. You can always contact us at mtcont@microsoft.com with feedback. Our user and developer forums are also available to discuss any issues you encounter.

7) How can I help improve the system?

The best way you can help improve the system is by helping us find more training data. This typically means sentences or words translated between English and Haitian Creole. We intend to make available to the larger community (via tausdata.org) data that we collect (as license restrictions permit) for training purposes. If you know of dictionaries, translated sentences, or websites that have such translations, we urge you to contribute them to TDA’s TAUS data sharing initiative. TDA is a non-profit organization providing a neutral and secure platform for sharing language data. If you have any concerns or questions, feel free to contact us at mtcont@microsoft.com.

8) How can I help the broader Haiti relief efforts?

Go here to learn more about how you can help those devastated by the earthquake.

9) Where can I get more information?

Please stay tuned to our blog for further announcements. You can learn more about Microsoft Translator and the services we offer here.

10) What can we expect next?

In the coming days expect to see support for Haitian Creole added to even more of our scenarios (Translation Bot, Translator widget, Office etc) as well as the AJAX API. Known issues and announcements can also be found on our forums.

We hope that this contribution proves useful to the various humanitarian efforts underway, and please stay tuned to this blog for further news on the Haitian Creole language support. If you have any questions feel free to contact us at mtcont@microsoft.com.

Update (2:53 PM PST): The Messenger Translation Bot can now speak Haitian Creole. Add mtbot@hotmail.com to your messenger buddy list. Try the group conversation feature with a Kreyol speaker!

-Vikram Dendi, Senior Product Manager, Microsoft Translator

Posted by MSR-MT Team | 8 Comments

TechEd Europe: Microsoft Translator widget and APIs in beta

Hallo aus Berlin! (Hello from Berlin!)

Thanks to the great feedback from the early adopters of the Microsoft Translator widget and APIs, we are pleased to remove the invite requirement and move the widget and APIs to public beta. Anyone can now generate a snippet for their site or application from the widget and AJAX API adoption portals.

Likewise, we are also pleased to announce the availability of API (SOAP and HTTP) licensing terms for commercial applications. Feel free to email mtlic@microsoft.com for more information. While in beta, there is no charge for commercial use of the API. The widget and AJAX API continue to be free for commercial use under the standard terms of use.

Thank you to all those who attended today’s session at TechEd Europe. Here is a recap:

  • Microsoft Translator APIs and the webpage widget are now in beta
  • Generate a translator widget for your webpage here, or use the AJAX API to further customize the translation experience
  • Detailed reference for the APIs on MSDN, Getting started guides for ASP.NET and PHP, Interactive SDK
  • 20+ languages supported by the service now (the latest 2 to be added in the next few days – Finnish and Bulgarian)
  • New to Windows 7 and Internet Explorer 8? Try the “Translate with Bing” accelerator powered by Microsoft Translator!
  • Microsoft Translator powers millions of translations each day for Office (2003-2010), Bing, Live Toolbar and the unique Messenger bot
  • Commercial licenses for the SOAP and HTTP APIs are available at no cost. Contact mtlic@microsoft.com for more details

We also greatly appreciate all the great feedback on what languages you would like to see, and we hope to satisfy many of the requests within the next few months. Stay tuned and keep the feedback coming!

-Vikram Dendi

Microsoft Translator

Posted by MSR-MT Team | 9 Comments

Twenty is a nice round number – say ยินดีต้อนรับ (welcome) to our newest release!

In my last update I had asked about what languages you wanted Microsoft Translator service to support. Thank you for taking the time to respond. We are pleased to announce that last week we added Czech (CSY), Danish (DAN), Greek (ELL), Swedish (SVE) and Thai (THA), taking our language count to a nice round 20.

Here is the complete list as of today:

  • ARA - Arabic
  • CHS - Chinese Simplified
  • CHT - Chinese Traditional
  • NLD - Dutch
  • ENU - English
  • FRA - French
  • DEU - German
  • HEB - Hebrew
  • ITA - Italian
  • JPN - Japanese
  • KOR - Korean
  • PLK - Polish
  • PTB - Portuguese
  • RUS - Russian
  • ESN - Spanish
  • CSY - Czech
  • DAN - Danish
  • ELL - Greek
  • SVE - Swedish
  • THA - Thai

You will be able to translate between these languages in all Microsoft Translator powered services including Bing Translator, Internet Explorer Accelerator, Office, Widget as well as in our APIs. Feel free to send in your feedback on the new languages via the forum. We do keenly follow your recommendations and requests as we prioritize new languages – so please do keep them coming in the comments section!

-Vikram Dendi

Microsoft Translator

Hebrew support is here. What do you want to see next?

שלום לעולם! (Hello, world!)

I am pleased to announce that we just added Hebrew to the list of languages that we support. You can immediately use it in Bing Translator, in IE8, with the widget, with the messenger bot, inside Office and of course with the API.


I would like to congratulate our language quality and coverage team on the progress they have been making with new languages. Over the next few months you will see more languages added to the mix, and also continue to see quality improvements for existing languages. Feel free to leave a comment on this thread about any languages you would particularly like to see.

We have also had many of you contacting us about helping find data sources that can be useful to train the machine translation system on – we appreciate your help! Our email address is mtcont@microsoft.com. Do stay in touch!

-Vikram

Posted by MSR-MT Team | 41 Comments

Microsoft Translator Instant Answers Now On Bing

Use Bing to instantly translate queries from one language to another with our translation Instant Answer!  Starting today, when you are looking for a translation of a word or phrase, go to Bing.com and kick off an instant translation, powered by Microsoft Translator.  Instant translation is another way that Bing helps you complete tasks faster by presenting better organized and more relevant content.

What to expect?

Example query: translate I love you

Bing returns: [screenshot]

Example query: translate I love you to Japanese

Bing returns: [screenshot]

Example query: how do you say apple juice in Spanish

Bing returns: [screenshot]
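For the curious, here is a rough sketch of how queries shaped like the three examples above could be recognized. It is not how Bing actually parses queries; the patterns are modeled only on these three examples, and the `KNOWN_LANGUAGES` set and the `parse_translation_query` function are hypothetical names made up for illustration.

```python
import re

# Illustrative subset of language names; an assumption, not the service's list.
KNOWN_LANGUAGES = {"japanese", "spanish", "french", "german", "italian"}

# Patterns modeled only on the three example queries in this post.
PATTERNS = [
    re.compile(r"^how do you say (?P<text>.+) in (?P<lang>\w+)$", re.IGNORECASE),
    re.compile(r"^translate (?P<text>.+) to (?P<lang>\w+)$", re.IGNORECASE),
    re.compile(r"^translate (?P<text>.+)$", re.IGNORECASE),
]


def parse_translation_query(query):
    """Return (text, target_language or None), or None if nothing matches."""
    cleaned = query.strip()
    for pattern in PATTERNS:
        match = pattern.match(cleaned)
        if not match:
            continue
        groups = match.groupdict()
        lang = groups.get("lang")
        # Skip matches whose "language" is not a language we know, so that
        # "translate I want to go" is not read as "translate ... to <go>".
        if lang is not None and lang.lower() not in KNOWN_LANGUAGES:
            continue
        return groups["text"], lang
    return None


for q in ["translate I love you",
          "translate I love you to Japanese",
          "how do you say apple juice in Spanish"]:
    print(q, "->", parse_translation_query(q))
```

Checking the captured language against a known list is the one design choice worth noting: without it, any sentence that happens to contain "to" before its last word would be misread as naming a target language.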

Enjoy!!

Cheers,

Lane Rau

Microsoft Translator

Posted by MSR-MT Team | 28 Comments

Any-to-Any Translations and Language Autodetect now available for Microsoft Translator

Today our team released some exciting updates for Microsoft Translator! It is now possible to translate from any of our languages to any other language. Spanish to Chinese? Arabic to German? Check :)

We have also added a Language Autodetect feature to our webpage translator. So if you’re on a page that’s in a language you don’t recognize, our translation engine will autodetect the language on the source page, and automatically start translating into the language of your choice.

We’ve also cleaned up our landing page to try and make it a bit easier to use. Let us know what you think!

In addition, some of you noticed a bug on some PC configurations with IE8: the IE8 Accelerator did not remember the user’s selected language. This release fixes the bug. Thanks to those of you who caught it and let us know! An added bonus: you can also translate from any language to any other language in the preview pane!


Download the Microsoft Translator installer for Microsoft Office

Now you can translate your Microsoft Office documents with Microsoft Translator – right within Office! You can translate words, phrases, or even your entire document, through the Research task pane. We blogged about setting this up manually for Office 2007 or Office 2003 previously - now it's really easy!

 

This works for both Microsoft Office 2003 and 2007. The current default in Microsoft Office is WorldLingo – this installer will update your task pane to use Microsoft Translator as the default translator for the languages we provide.

 

Download the installer now and let us know what you think over in the Forum!
Posted by MSR-MT Team | 44 Comments

Silk Road Power Trio: New API features, Microsoft Translator, PowerToys CodePlex Launch

Our friends over in Live Search featured Microsoft Translator in their MIX09 blog post - check it out!

 

Posted by MSR-MT Team | 6 Comments

Announcing the Microsoft Translator web page widget

The Microsoft Translator team is very proud to announce the technology preview of an innovative offering for web page translations. Attendees of MIX09 this week get a special invitation to try out the Microsoft Translator web page widget. We are also accepting registrations, and will be sending out more invites as they become available.

What it is: Built on top of the Microsoft Translator AJAX API (also announced today), it is a small, customizable widget that you can place on your web page, and it helps you instantly make the page available in multiple languages.

Who it is for: Anyone with a web page. If you can paste a small snippet of code into your page, you will be able to display the widget to your audience. No need to know programming intricacies, or how to call a JavaScript API. No need to write or install server side plug-ins for your specific software.

What it offers: It provides a simple interface to anyone that visits the web page to select and translate content into a different language. You can see a demo on this page.

What is cool about it:

  • Innovative: Unlike other (including our) existing solutions, it does not take the users away from the site. The translations are in-place and instant. Users can hover over the translation to see the original.
  • Easy to Use: Adding it to your page is as easy as copy and paste. Using it on the site is as easy as select language and click the button.
  • Customizable: You can pick the colors that best blend into your site design. You can pick the size that would best fit into your design (in fact the widget has an adaptive layout that better uses real estate when very wide).
  • Thoughtful User Experience: Progressive rendering allows for the page to get translated progressively – without having the user stare at a white space while the translation is being performed. The translation toolbar that appears when the translation is kicked off provides a progress indicator, the languages selected and a way to turn off the translation.  
  • Localized: The UI is available in multiple languages – so users that come to your page with their browser set to a different language will see the widget in their language. 

Fun! What does it cost: It is completely free. You can put it on any site – commercial or non-commercial. You are only limited by the invite codes available at this point, but over the coming months we plan to make it more widely available.

What we are working on:

  • More polish: We will be looking for your feedback and continue to work on the fit and finish for the widget & toolbar UI.
  • More customizability: We will be evolving the default color palette available to you through the adoption portal. We will also be looking at your feedback on the overall design.
  • New Features: There are a bunch of very cool features that we are working on that will be added soon (your widgets will inherit most of these features). These include “Automatic” translations on page load, multiple layouts/views (bringing in the well received views feature of our bi-lingual viewer offering) and some surprises that we are working on with other teams at Microsoft.

Other questions:

I can’t get it to work. Where can I get support or provide feedback?

I would like to highlight that this is a technology preview release – so please do test it on your site before presenting to your users. The Microsoft Translator forums are now live. Feel free to head over and interact with other users. You will also find members of our team there who can help.

Can this save me the cost of doing human translation on my professional website?

Our goal (and that of most machine translation systems available today) is to provide what we call “useful” translations. While the technology is improving month to month, it will still take a long time before it can match human translation quality. We don’t recommend using machine translation for sensitive or highly critical information. You can learn more about translation quality here and here. You can learn more about how we do machine translation here.

How many languages do you support? When can you add support for <insert language here>?

Currently we support the following languages.

  • Arabic
  • Chinese (Simplified & Traditional)
  • Dutch
  • French
  • German
  • Italian
  • Japanese
  • Korean
  • Polish
  • Portuguese
  • Russian
  • Spanish

Polish was our most recent addition. Our goal is to keep adding languages as we get enough training data to meet our minimum (“useful”) quality criteria which include both standard measurements and human evaluations.

I am using a hosted service for my site/blog that does not let me use JavaScript widgets. What can I do?

We are looking to work with providers of hosted services to make adding the widget an easy process for their users. If your provider does not offer this, please let them and us know that you would like to see the widget work with your site.

Keep checking this post and our forums for announcements, known issues and more information. You can follow our MIX09 coverage on Twitter and on Vikram’s blog.

Last Updated: 3/18/2009, 4:15 PM

Posted by MSR-MT Team | 48 Comments

Announcements and Sessions at MIX09 (Updated)

The Microsoft Translator team is sponsoring MIX09 this year and we will be showing off the new web page translation widget and the translator APIs. If you are attending MIX, come to our session!

All attendees got widget invite codes in their bags. At this session you will see how you can make use of them.

MIX09-B05M Exposing Web Content to a Global Audience Using Machine Translation. San Polo 3401 | Thursday March 19 | 1:25 PM-1:45 PM

We will also be at the Live Search session.

MIX09-T33F Customized Live Search for Web and Client Applications. Delfino 4001 | Thursday March 19 | 1:00 PM-2:15 PM

We will also be giving out the exclusive API invite codes at these sessions – so make sure to be there!

Keep an eye on this blog for further news and tidbits from MIX.

- Vikram Dendi
Business Strategy & Front End Program Management
Microsoft Translator

Posted by MSR-MT Team | 5 Comments

Polish now available on MicrosoftTranslator.com

We are happy to announce the release of our English <-> Polish translation engine!  Some of you have been asking us about Polish translation, and we’re excited to deliver for you!  We’ve also made some quality improvements to our Spanish, French, German, and Italian engines – if you’ve tried these languages out in the past, give them another try and let us know what you think. 

One more thing I wanted to mention - we try out various features from time to time, as part of our commitment to offer new ways for our users to take advantage of translation.  You might notice that the link to get a professional translation has been taken down as part of our current release. We launched this service back in June 2008.  This experimental feature has run its course (for now) and we are evaluating the experience to see how we could make it better. Stay tuned to this blog for updates, and of course your feedback is always appreciated!

Posted by MSR-MT Team | 9 Comments

Testing translation quality: Guest Blog

Anand Chakravarty has been an SDET on the Machine Translation team for the past 2.5 years, has been at Microsoft for 8 years, and was the first product tester on the MT team (and “still having fun with testing MT :-)”).  Today’s guest blog is about testing translation quality.

---------------------------------------------------------------------------------------------------------------

One of the first points that comes to mind, when talking about verifying the quality of a translation system, is how do you measure the quality, or to be precise, the accuracy of translation? Translating between human languages using computers is a field that is almost half-a-century old. The area is challenging enough that even the best currently available machine translation systems are not close to obtaining linguistic quality that would be entirely satisfactory.

Part of the challenge is the many different data-points that humans process in order to understand the meaning of spoken/written text. There is the syntax, the parsing, the semantics, the context, the disambiguation, the reordering, all of which, and more, go into understanding a sentence. And this is just the sentence in one language. Now consider applying all of it to rebuild the sentence in another language and make it equally meaningful.

Some examples might help to make this point clearer. The term ‘Olympics 2008’ is fairly unambiguous. Similarly, one might expect the term ‘Elections 2008’ to mean the presidential elections in the USA. However, if the user is from, say, Canada, it would more likely refer to the local elections there.

A more general, and hence more common, example is a sentence like ‘The note was wrong’. Is the word ‘note’ a reference to an informative message or to a musical term? The proper translation depends upon context. Use more context, and your chances of getting a more accurate translation improve. This however comes at a cost: the more context the system tries to obtain, the slower its performance. Smart shipping decisions involve striking the right balance between improving the accuracy of translation and delivering a workable translation result to users. Of course, both are important. The key is to understand where you direct efforts at improvement depending on how useful the end result is to the user.

This becomes particularly interesting when translating documents or web-pages, instead of just individual sentences. Let us say a translation request has been received for a web-page containing 100 sentences. Depending on the architecture of the translation system, these sentences could all go to one process, or be distributed across multiple processes/machines. Either way, it is clear that the time taken to translate this page in its entirety is proportional to the maximum time taken to translate a sentence. How long do we spend translating a sentence before that invested time becomes detrimental to the user’s time? In pursuit of the best translation, we might end up blocking the user from getting anything informative in response to their translation request. The utility of the system is thus governed by decisions that are made to balance linguistic quality and application performance.
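To make the trade-off concrete, here is a minimal sketch of one way to cap the time spent on a page: translate sentences in parallel under a shared time budget, and fall back to the original text for any sentence that is not done in time. This is an illustration only, not a description of our production system; `translate_sentence`, `translate_page`, and the budget value are all hypothetical, and the "translation" is simulated with a sleep.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def translate_sentence(sentence):
    """Hypothetical stand-in for an MT call: longer sentences take longer."""
    time.sleep(0.01 * len(sentence.split()))
    return sentence.upper()  # pretend "translation"


def translate_page(sentences, budget_seconds=0.2, workers=8):
    """Translate sentences in parallel under one shared time budget.

    Sentences that are not finished when the budget runs out are returned
    untranslated, so the caller always gets a full page back.
    """
    pool = ThreadPoolExecutor(max_workers=workers)
    deadline = time.monotonic() + budget_seconds
    futures = [pool.submit(translate_sentence, s) for s in sentences]
    results = []
    for sentence, future in zip(sentences, futures):
        remaining = max(0.0, deadline - time.monotonic())
        try:
            results.append(future.result(timeout=remaining))
        except FutureTimeout:
            results.append(sentence)  # fall back to the original text
    pool.shutdown(wait=False)  # do not block on stragglers
    return results


page = ["Hello world", " ".join(["word"] * 50)]
print(translate_page(page))
```

With these made-up timings, the two-word sentence finishes well inside the budget while the fifty-word one does not, so the page comes back quickly with one sentence left untranslated. Where to set the budget is exactly the quality-versus-performance decision described above.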

With the Microsoft Translator product, there is the additional feature of our Bilingual Viewer, something unique among publicly available translation products. It supports parallel text highlighting, synchronized scrolling and presents the page(s) with progressive rendering. This adds another layer to what our users see, and consequently another layer to polish and finish.

In the coming weeks, we hope to bring you more details of specific areas that were and are being tested to ship a top-quality translation system. Feel free to post any questions you have on this matter, something you always wanted to ask :-), in the Comments section.

Posted by MSR-MT Team | 12 Comments

A metal can (can’t it?): Guest Blog

Lee Schwartz is a Computational Linguist on the Microsoft Translator team.  Today’s guest blog is about getting lost in (machine) translation…

-----------------------------------------------------------------------------------

Recently, a user seemed upset with the translation he received for a metal paint can.  No wonder.  When he translated this into Spanish, he got un metal pintura puede, which means a metal paint is able to.  And, what is that supposed to mean?  But, then again, what is "meaning" to a machine translation system anyway?  Does anything mean anything?  Or, is the computer just seeing words in combination in one language and corresponding words in another language?  And is it assuming that because one sequence is used in the source language when another is used in the target, one is the translation of another?  Even if the machine translation program is just seeing words in combination, wouldn't it have seen paint can before and know that the can in this context is some kind of container?   Then, again, can you be sure that the computer behind the MT program knows anything about paint cans, or has seen those two words in combination?  Why do you think it would have?  But, giving it the benefit of the doubt, and assuming it knows all about paint cans, or at least has seen the string paint can a lot, how is it supposed to know how to translate a metal paint can?   Maybe the computer has seen something like The metal film on one side of the plate...  may be obtained by ...spraying a metal paint or ....  

Ah ha!  So there really are metal paints.  And, if there are metal paints, why can't a metal paint can be the answer to a metal paint can, can't it?  Well, it is just not likely that when you have the words paint and can in sequence, that can means be able to.  But then again it is just not likely that can means anything but be able to.  I guess we can say things and think things that are just not likely.  I can easily understand what A metal paint can can, can't it? means.  The computer might just think that I inadvertently typed can twice.  Certainly, if it learns from real data, say from the Web, it will see can can a lot.  Maybe that is why it won't translate He did the can can correctly.  But really, what is English doing with so many types of cans anyway?  We can even can worms, but we won’t open that one now.   
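For readers who like to see the point in code, here is a toy sketch of a system that only "sees words in combination". It picks a sense for can from how often each sense followed a given word in training, and falls back to the overall most frequent sense for words it has never seen before can. The counts are invented for illustration and the names (`OBSERVATIONS`, `build_model`, `guess_sense`) are hypothetical; a real statistical MT system is far more elaborate than this.

```python
from collections import Counter, defaultdict

# Toy, hand-made training pairs: (word before "can", sense of "can").
# These are invented for illustration, not real corpus statistics.
OBSERVATIONS = [
    ("i", "modal"), ("you", "modal"), ("we", "modal"), ("it", "modal"),
    ("paint", "container"), ("paint", "container"), ("paint", "modal"),
    ("tin", "container"), ("trash", "container"),
]

SPANISH = {"modal": "puede", "container": "lata"}


def build_model(observations):
    """Count senses per preceding word, plus overall sense counts."""
    by_context = defaultdict(Counter)
    overall = Counter()
    for prev_word, sense in observations:
        by_context[prev_word][sense] += 1
        overall[sense] += 1
    return by_context, overall


def guess_sense(prev_word, model):
    """Pick the most frequent sense after prev_word, else the global favorite."""
    by_context, overall = model
    counts = by_context.get(prev_word.lower())
    source = counts if counts else overall
    return source.most_common(1)[0][0]


model = build_model(OBSERVATIONS)
for prev in ["paint", "metal"]:
    print(prev, "can ->", SPANISH[guess_sense(prev, model)])
```

In this toy data the container sense wins after paint, but a word never seen before can gets the globally likelier "be able to" reading. That is the sparse-data effect the post is poking at: the less the system has seen of a combination, the more it leans on what is merely likely in general.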

Posted by MSR-MT Team | 8 Comments

Translation User Experience: Guest Blog

Andrea Jessee is the Senior Program Manager on the Microsoft Translator team in charge of the user experience.  Today's guest blog is about how the team thinks about user experience with translation.

Creating a better user experience

We have shown the suite of Microsoft Translation services at various shows and tech events. The number one question we get is: Show me how it translates <some interesting example sentence>. Sometimes we do well, other times the system behavior is (probably) as expected: meaning – we choke on the (possibly highly ambiguous) sentence and produce something funny. We know that the hard problem of Machine Translation has not been solved yet. We are working tirelessly on translation quality improvements and expansion, but it remains a hard nut to crack – for anyone in the field. Why – if we know it – don’t we wait for the major break-through instead of releasing a service that is far from perfect? The answer is simple: We recognize the growing need for such a service. In this era of the ever-expanding internet which is blissfully ignoring any geographic borders, in times where information retrieval must cross language boundaries to ensure access to the bigger picture, in recognition of the fact that English is a dominating language in our “world-wide-web”, we simply must respond to the resulting needs today. And so we do … like other respectable providers in the field, we offer a free service to the best of our current technological and scientific abilities.

We take it one step further

In addition to our investment in the core translation technology, the Microsoft Translator team has spent a significant amount of effort on the creation of a user experience which acknowledges and mitigates current limitations of raw translation quality, maximizing its usefulness to our users. This is especially highlighted in our distinguished Bilingual Viewer: its commitment to provide ease of access to the original and translated text and its one-click view customizations, all enhanced by parallel highlighting, synchronized scrolling and navigation functions, has received rave reviews.

A user-friendly UI concept is only one of our approaches to bridge the gap between a current need for our service and the current limitations. In focus groups we have learned that ease of access to our service is being expected from a wide range of other Microsoft properties. Hence, a seamless integration into other communication and authoring tools became a vital part of our mission to create a better user experience for the consumers of our translation service.

The Windows Live Translator toolbar button gives immediate access to the Bilingual Viewer experience from wherever you are on the web. Our friendly Translation Bot TBot can either translate text for you using Windows Live Messenger, or serve as your personal chat interpreter between you and your international buddies. Internet Explorer 8 has the translation service built right into its Accelerators, offering text or full page translations with as little as a mouse hover or click. If you wish to use our translation service directly from Office Word, you can do this today without the need to wait for a new Office release. And yes, full document translation is delivered in bilingual view. Of course, the same functionality is also available to you in Office Outlook, if you have chosen to display it in Office Word mode. We also would like site owners to benefit from our offering if they’d like to make their pages available with free translations. A simple copy/paste action is all it takes to add our web page translator to your site.

In further acknowledgement of the limitations of machine translation, we also offer a direct link to an affordable professional on-demand translation team, delivering human translations sometimes in a matter of hours.

And we are not done yet … please stay tuned for new releases of more and better features, experiences, and integration scenarios. And please do continue to post your wishes to our blog. We read all your posts carefully and will factor in your feedback in the planning of our next design steps…

Thank you!

Andrea Jessee, MSR-MT User Experience PM

Posted by MSR-MT Team | 9 Comments