The Microsoft Translator team is very proud to announce the technology preview of an innovative offering for web page translations. Attendees of MIX09 this week get a special invitation to try out the Microsoft Translator web page widget. We are also accepting registrations, and will be sending out more invites as they become available.
What it is: Built on top of the Microsoft Translator AJAX API (also announced today), it is a small, customizable widget that you can place on your web page – and it helps you instantly make the page available in multiple languages.
What it offers: It provides a simple interface to anyone that visits the web page to select and translate content into a different language. You can see a demo on this page.
What is cool about it:
Fun! What does it cost: It is completely free. You can put it on any site – commercial or non-commercial. You are only limited by the invite codes available at this point, but over the coming months we plan to make it more widely available.
What we are working on:
I can’t get it to work. Where can I get support or provide feedback?
I would like to highlight that this is a technology preview release – so please do test it on your site before presenting to your users. The Microsoft Translator forums are now live. Feel free to head over and interact with other users. You will also find members of our team there who can help.
Can this save me the cost of doing human translation on my professional website?
Our goal (and that of most machine translation systems available today) is to provide what we call “useful” translations. While the technology is improving month to month, it will still take a long time before it can match human translation quality. We don’t recommend using machine translation for sensitive or highly critical information. You can learn more about translation quality here and here. You can learn more about how we do machine translation here.
How many languages do you support? When can you add support for <insert language here>?
Currently we support the following languages.
· Chinese (Simplified & Traditional)
Polish was our most recent addition. Our goal is to keep adding languages as we get enough training data to meet our minimum (“useful”) quality criteria, which include both standard measurements and human evaluations.
We are looking to work with providers of hosted services to make adding the widget an easy process for their users. If your provider does not offer this, please let them and us know that you would like to see the widget work with your site.
Keep checking this post and our forums for announcements, known issues and more information. You can follow our MIX09 coverage on Twitter and on Vikram’s blog.
Last Updated: 3/18/2009, 4:15 PM
The Microsoft Translator team is sponsoring MIX 09 this year and we will be showing off the new web page translation widget and the translator APIs. If you are attending MIX, come to our session!
All attendees got widget invite codes in their bags. At this session you will see how you can make use of them.
MIX09-B05M Exposing Web Content to a Global Audience Using Machine Translation. San Polo 3401 | Thursday March 19 | 1:25 PM-1:45 PM
We will also be at the Live Search session.
MIX09-T33F Customized Live Search for Web and Client Applications. Delfino 4001 | Thursday March 19 | 1:00 PM-2:15 PM
We will also be giving out the exclusive API invite codes at these sessions – so make sure to be there!
Keep an eye on this blog for further news and tidbits from MIX.
- Vikram Dendi Business Strategy & Front End Program Management Microsoft Translator
We are happy to announce the release of our English <-> Polish translation engine! Some of you have been asking us about Polish translation, and we’re excited to deliver for you! We’ve also made some quality improvements to our Spanish, French, German, and Italian engines – if you’ve tried these languages out in the past, give them another try and let us know what you think.
One more thing I wanted to mention - we try out various features from time to time, as part of our commitment to offer new ways for our users to take advantage of translation. You might notice that the link to get a professional translation has been taken down as part of our current release. We launched this service back in June 2008. This experimental feature has run its course (for now) and we are evaluating the experience to see how we could make it better. Stay tuned to this blog for updates, and of course your feedback is always appreciated!
Anand Chakravarty has been an SDET on the Machine Translation team for the past 2.5 years, has been at Microsoft for 8 years, and was the first product tester on the MT team (and “still having fun with testing MT :-)”). Today’s guest blog is about testing translation quality.
One of the first points that comes to mind when talking about verifying the quality of a translation system is how to measure the quality, or to be precise, the accuracy of translation. Translating between human languages using computers is a field that is almost half a century old. The area is challenging enough that even the best currently available machine translation systems are not close to obtaining linguistic quality that would be entirely satisfactory.
Part of the challenge is the many different data points that humans process in order to understand the meaning of spoken or written text. There is the syntax, the parsing, the semantics, the context, the disambiguation, the reordering, all of which, and more, go into understanding a sentence. And this is just the sentence in one language. Now consider applying all of it to rebuild the sentence in another language and make it equally meaningful.
Some examples might help to make this point clearer. The term ‘Olympics 2008’ is fairly unambiguous. Similarly, one might expect the term ‘Elections 2008’ to mean the presidential elections in the USA. However, if the user is from, say, Canada, it would more likely refer to the local elections there.
A more general, and hence more common, example is a sentence like ‘The note was wrong’. Is the word ‘note’ a reference to an informative message or to a musical term? The proper translation depends upon context. Use more context, and your chances of getting a more accurate translation improve. This, however, comes at a cost: the more context the system tries to obtain, the slower its performance. Smart shipping decisions involve striking the right balance between improving the accuracy of translation and delivering a workable translation result to users. Of course, both are important. The key is to understand where to direct efforts at improvement, depending on how useful the end result is to the user.
This becomes particularly interesting when translating documents or web-pages, instead of just individual sentences. Let us say a translation request has been received for a web-page containing 100 sentences. Depending on the architecture of the translation system, these sentences could all go to one process, or be distributed across multiple processes/machines. Either way, the page is not finished until its slowest sentence is, so the time taken to translate the page in its entirety can never be shorter than the maximum time taken to translate a sentence. How long do we spend translating a sentence before that invested time becomes detrimental to the user’s time? In pursuit of the best translation, we might end up blocking the user from getting anything informative in response to their translation request. The utility of the system is thus governed by decisions that are made to balance linguistic quality and application performance.
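As a rough illustration of this reasoning, the sketch below estimates how long a 100-sentence page takes when one sentence is much slower than the rest. Everything in it is an assumption made for illustration: the timings are invented, and the `page_latency` function, its greedy scheduling, and the `budget_ms` per-sentence cap are hypothetical names and choices, not a description of how Microsoft Translator schedules work.

```python
from typing import List, Optional


def page_latency(sentence_ms: List[int], workers: int,
                 budget_ms: Optional[int] = None) -> int:
    """Estimate wall-clock milliseconds to translate every sentence on a page.

    sentence_ms: hypothetical per-sentence translation times in milliseconds.
    workers:     number of processes translating in parallel.
    budget_ms:   optional cap on time spent per sentence (assumed: the system
                 returns its best result so far when the cap is reached).
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    # Apply the per-sentence time budget, if one is given.
    capped = [t if budget_ms is None else min(t, budget_ms) for t in sentence_ms]
    # Greedy scheduling: hand the longest remaining sentence to the
    # least-loaded worker. The page is done when the busiest worker is done.
    loads = [0] * workers
    for t in sorted(capped, reverse=True):
        idx = loads.index(min(loads))
        loads[idx] += t
    return max(loads)


# 99 quick sentences and one hard sentence (made-up numbers).
times = [50] * 99 + [4000]
one_process = page_latency(times, workers=1)
fully_parallel = page_latency(times, workers=100)
with_budget = page_latency(times, workers=100, budget_ms=500)
```

With a single process the times simply add up; with enough parallel workers the one slow sentence dominates the whole page, and capping the time spent on any sentence is one way to trade some quality on that sentence for a faster page.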
With the Microsoft Translator product, there is the additional feature of our Bilingual Viewer, something unique among publicly available translation products. It supports parallel text highlighting, synchronized scrolling and presents the page(s) with progressive rendering. This adds another layer to what our users see, and consequently another layer to polish and finish.
In the coming weeks, we hope to bring you more details of specific areas that were and are being tested to ship a top-quality translation system. Feel free to post any questions you have on this matter, something you always wanted to ask :-), in the Comments section.
Lee Schwartz is a Computational Linguist on the Microsoft Translator team. Today’s guest blog is about getting lost in (machine) translation…
Recently, a user seemed upset with the translation he received for ‘a metal paint can’. No wonder. When he translated this into Spanish, he got ‘un metal pintura puede’, which means ‘a metal paint is able to’. And, what is that supposed to mean? But, then again, what is "meaning" to a machine translation system anyway? Does anything mean anything? Or, is the computer just seeing words in combination in one language and corresponding words in another language? And is it assuming that because one sequence is used in the source language when another is used in the target, one is the translation of another? Even if the machine translation program is just seeing words in combination, wouldn't it have seen ‘paint can’ before and know that the ‘can’ in this context is some kind of container? Then, again, can you be sure that the computer behind the MT program knows anything about paint cans, or has seen those two words in combination? Why do you think it would have? But, giving it the benefit of the doubt, and assuming it knows all about paint cans, or at least has seen the string ‘paint can’ a lot, how is it supposed to know how to translate ‘a metal paint can’? Maybe the computer has seen something like ‘The metal film on one side of the plate... may be obtained by ...spraying a metal paint or ....’
Ah ha! So there really are metal paints. And, if there are metal paints, why can't a metal paint can be the answer to a metal paint can, can't it? Well, it is just not likely that when you have the words ‘paint’ and ‘can’ in sequence, that ‘can’ means ‘be able to’. But then again it is just not likely that ‘can’ means anything but ‘be able to’. I guess we can say things and think things that are just not likely. I can easily understand what ‘A metal paint can can, can't it?’ means. The computer might just think that I inadvertently typed ‘can’ twice. Certainly, if it learns from real data, say from the Web, it will see ‘can can’ a lot. Maybe that is why it won't translate ‘He did the can can’ correctly. But really, what is English doing with so many types of cans anyway? We can even can worms, but we won’t open that one now.
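To make the ‘paint can’ puzzle a little more concrete, here is a toy sketch of the difference between translating word by word and having seen two words in combination. It is purely illustrative: the tiny `WORDS` and `PHRASES` tables and the `translate` function are hypothetical, hand-written for this example, and are not how any real MT engine (ours included) is built.

```python
from typing import Dict, List, Tuple

# Hypothetical single-word table: 'can' only knows its most common sense.
WORDS: Dict[str, str] = {
    "a": "un",
    "metal": "metal",
    "paint": "pintura",
    "can": "puede",
}

# Hypothetical phrase table: word sequences "seen in combination".
PHRASES: Dict[Tuple[str, ...], str] = {
    ("paint", "can"): "lata de pintura",
}


def translate(sentence: str, use_phrases: bool) -> str:
    """Translate left to right, optionally preferring the longest known phrase."""
    tokens = sentence.lower().split()
    longest = max((len(key) for key in PHRASES), default=1)
    out: List[str] = []
    i = 0
    while i < len(tokens):
        matched = False
        if use_phrases:
            # Try the longest phrase first, down to two-word phrases.
            for n in range(min(longest, len(tokens) - i), 1, -1):
                key = tuple(tokens[i:i + n])
                if key in PHRASES:
                    out.append(PHRASES[key])
                    i += n
                    matched = True
                    break
        if not matched:
            # Fall back to the single-word table; unknown words pass through.
            out.append(WORDS.get(tokens[i], tokens[i]))
            i += 1
    return " ".join(out)


word_by_word = translate("a metal paint can", use_phrases=False)
with_phrases = translate("a metal paint can", use_phrases=True)
```

Word by word, the toy reproduces the translation the user complained about. With the phrase table, ‘can’ finally becomes a container, but the output is still not fluent Spanish: the word order and the gender of the article are untouched, which is where the reordering and context mentioned earlier come in.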