We are happy to announce the release of our English <-> Polish translation engine! Some of you have been asking us about Polish translation, and we’re excited to deliver for you! We’ve also made some quality improvements to our Spanish, French, German, and Italian engines – if you’ve tried these languages out in the past, give them another try and let us know what you think.
One more thing I wanted to mention - we try out various features from time to time, as part of our commitment to offer new ways for our users to take advantage of translation. You might notice that the link to get a professional translation has been taken down as part of our current release. We launched this service back in June 2008. This experimental feature has run its course (for now) and we are evaluating the experience to see how we could make it better. Stay tuned to this blog for updates, and of course your feedback is always appreciated!
Anand Chakravarty has been an SDET on the Machine Translation team for the past 2.5 years, has been at Microsoft for 8 years, and was the first product tester on the MT team (and “still having fun with testing MT :-)”). Today’s guest blog is about testing translation quality.
One of the first questions that comes to mind when talking about verifying the quality of a translation system is: how do you measure the quality or, to be precise, the accuracy of translation? Translating between human languages using computers is a field that is almost half a century old. The area is challenging enough that even the best currently available machine translation systems are not close to obtaining linguistic quality that would be entirely satisfactory.
Part of the challenge is the many different data points that humans process in order to understand the meaning of spoken or written text. There is the syntax, the parsing, the semantics, the context, the disambiguation, the reordering, all of which, and more, go into understanding a sentence. And this is just the sentence in one language. Now consider applying all of it to rebuild the sentence in another language and make it equally meaningful.
Some examples might help to make this point clearer. The term ‘Olympics 2008’ is fairly unambiguous. Similarly, one might expect the term ‘Elections 2008’ to mean the presidential elections in the USA. However, if the user is from, say, Canada, it would more likely refer to the local elections there.
A more general, and hence more common, example is a sentence like ‘The note was wrong’. Is the word ‘note’ a reference to an informative message or to a musical term? The proper translation depends upon context. Use more context, and your chances of getting a more accurate translation improve. This, however, comes at a cost: the more context the system tries to obtain, the slower its performance. Smart shipping decisions involve striking the right balance between improving the accuracy of translation and delivering a workable translation result to users. Of course, both are important. The key is to decide where to direct efforts at improvement based on how useful the end result is to the user.
This becomes particularly interesting when translating documents or web pages, instead of just individual sentences. Let us say a translation request has been received for a web page containing 100 sentences. Depending on the architecture of the translation system, these sentences could all go to one process, or be distributed across multiple processes or machines. Either way, the page is not complete until its slowest sentence has been translated, so the longest single-sentence translation time sets a floor on the time for the whole page. How long do we spend translating a sentence before that invested time becomes detrimental to the user’s time? In pursuit of the best translation, we might end up blocking the user from getting anything informative in response to their translation request. The utility of the system is thus governed by decisions that are made to balance linguistic quality and application performance.
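To make the trade-off concrete, here is a minimal sketch of how one slow sentence can dominate the time for a whole page, and how a per-sentence time budget changes that. The function name, the timings, the worker counts and the budget are all invented for illustration; they are not measurements or settings from Microsoft Translator.

```python
def page_latency(sentence_times, workers=1, budget=None):
    """Estimate the time (in seconds) to translate a whole page.

    sentence_times: hypothetical per-sentence translation times.
    workers: number of parallel processes (an assumption of this sketch).
    budget: optional cap on the time spent on any single sentence.
    """
    if budget is not None:
        # Give up on a sentence once it has used its time budget.
        sentence_times = [min(t, budget) for t in sentence_times]

    # Greedy scheduling: hand each sentence, longest first,
    # to whichever worker currently has the least work.
    loads = [0.0] * workers
    for t in sorted(sentence_times, reverse=True):
        least_loaded = loads.index(min(loads))
        loads[least_loaded] += t

    # The page is done only when the busiest worker is done.
    return max(loads)


# A made-up page: 99 quick sentences and one very slow one.
times = [0.1] * 99 + [5.0]

one_process = page_latency(times, workers=1)               # sum of all times
fully_parallel = page_latency(times, workers=100)          # slowest sentence
with_budget = page_latency(times, workers=100, budget=1.0) # capped at budget
```

With a single process the page time is the sum of all sentence times; with one worker per sentence it is the time of the slowest sentence; with a budget, that slowest sentence is cut off early, at the cost of a possibly weaker translation for it.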
With the Microsoft Translator product, there is the additional feature of our Bilingual Viewer, something unique among publicly available translation products. It supports parallel text highlighting and synchronized scrolling, and presents the page(s) with progressive rendering. This adds another layer to what our users see, and consequently another layer to polish and finish.
In the coming weeks, we hope to bring you more details of specific areas that were and are being tested to ship a top-quality translation system. Feel free to post any questions you have on this matter, something you always wanted to ask :-), in the Comments section.
Lee Schwartz is a Computational Linguist on the Microsoft Translator team. Today’s guest blog is about getting lost in (machine) translation…
Recently, a user seemed upset with the translation he received for "a metal paint can". No wonder. When he translated this into Spanish, he got "un metal pintura puede", which means "a metal paint is able to". And what is that supposed to mean? But then again, what is "meaning" to a machine translation system anyway? Does anything mean anything? Or is the computer just seeing words in combination in one language and corresponding words in another language? And is it assuming that because one sequence is used in the source language when another is used in the target, one is the translation of another? Even if the machine translation program is just seeing words in combination, wouldn't it have seen "paint can" before and know that the "can" in this context is some kind of container? Then again, can you be sure that the computer behind the MT program knows anything about paint cans, or has seen those two words in combination? Why do you think it would have? But, giving it the benefit of the doubt, and assuming it knows all about paint cans, or at least has seen the string "paint can" a lot, how is it supposed to know how to translate "a metal paint can"? Maybe the computer has seen something like "The metal film on one side of the plate... may be obtained by ...spraying a metal paint or ...."
Ah ha! So there really are metal paints. And, if there are metal paints, why can't a metal paint can be the answer to a metal paint can, can't it? Well, it is just not likely that when you have the words "paint" and "can" in sequence, that "can" means "be able to". But then again it is just not likely that "can" means anything but "be able to". I guess we can say things and think things that are just not likely. I can easily understand what "A metal paint can can, can't it?" means. The computer might just think that I inadvertently typed "can" twice. Certainly, if it learns from real data, say from the Web, it will see "can can" a lot. Maybe that is why it won't translate "He did the can can" correctly. But really, what is English doing with so many types of cans anyway? We can even can worms, but we won’t open that one now.
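The "paint can" puzzle can be sketched in a few lines. The toy below contrasts translating one word at a time with first checking whether two adjacent words have been seen together as a phrase. The tiny word and phrase tables are hypothetical and invented for this illustration (with "lata de pintura" as a Spanish rendering of "paint can"); this is not a description of how the Microsoft Translator engine actually works.

```python
# Hypothetical word and phrase tables, invented for this illustration.
WORDS = {"a": "un", "metal": "metal", "paint": "pintura", "can": "puede"}
PHRASES = {("paint", "can"): "lata de pintura"}


def word_by_word(sentence):
    """Translate each word on its own, ignoring its neighbours."""
    return " ".join(WORDS.get(w, w) for w in sentence.lower().split())


def phrase_first(sentence):
    """Prefer a known two-word phrase; fall back to single words."""
    tokens = sentence.lower().split()
    out, i = [], 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in PHRASES:
            out.append(PHRASES[pair])   # the pair was seen together
            i += 2
        else:
            out.append(WORDS.get(tokens[i], tokens[i]))
            i += 1
    return " ".join(out)


print(word_by_word("a metal paint can"))   # un metal pintura puede
print(phrase_first("a metal paint can"))   # un metal lata de pintura
```

Even the phrase-aware version is still rough, since nothing here reorders the words, and it only helps if the pair "paint can" really was in the data the system learned from.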
Andrea Jessee is the Senior Program Manager on the Microsoft Translator team in charge of the user experience. Today's guest blog is about how the team thinks about user experience with translation.
Creating a better user experience
We have shown the suite of Microsoft Translation services at various shows and tech events. The number one question we get is: Show me how it translates <some interesting example sentence>. Sometimes we do well; other times the system behavior is (probably) as expected, meaning we choke on the (possibly highly ambiguous) sentence and produce something funny. We know that the hard problem of Machine Translation has not been solved yet. We are working tirelessly on translation quality improvements and expansion, but it remains a hard nut to crack – for anyone in the field. Why, if we know this, don’t we wait for the major breakthrough instead of releasing a service that is far from perfect? The answer is simple: we recognize the growing need for such a service. In this era of the ever-expanding internet, which blissfully ignores geographic borders, in times where information retrieval must cross language boundaries to ensure access to the bigger picture, and in recognition of the fact that English is a dominant language on our “world-wide-web”, we simply must respond to the resulting needs today. And so we do … like other respectable providers in the field, we offer a free service to the best of our current technological and scientific abilities.
We take it one step further
In addition to our investment in the core translation technology, the Microsoft Translator team has spent a significant amount of effort on creating a user experience that acknowledges and mitigates the current limitations of raw translation quality, maximizing its usefulness to our users. This is especially highlighted in our distinguished Bilingual Viewer: its easy access to both the original and the translated text and its one-click view customizations, all enhanced by parallel highlighting, synchronized scrolling and navigation functions, have received rave reviews.
A user-friendly UI concept is only one of our approaches to bridge the gap between a current need for our service and the current limitations. In focus groups we have learned that ease of access to our service is expected from a wide range of other Microsoft properties. Hence, a seamless integration into other communication and authoring tools became a vital part of our mission to create a better user experience for the consumers of our translation service.
The Windows Live Translator toolbar button gives immediate access to the Bilingual Viewer experience from wherever you are on the web. Our friendly Translation Bot TBot can either translate text for you using Windows Live Messenger, or serve as your personal chat interpreter between you and your international buddies. Internet Explorer 8 has the translation service built right into its Accelerators, offering text or full page translations with as little as a mouse hover or click. If you wish to use our translation service directly from Office Word, you can do this today without the need to wait for a new Office release. And yes – full document translation is delivered in bilingual view. Of course, the same functionality is also available to you in Office Outlook, if you have chosen to display it in Office Word mode. We would also like site owners to benefit from our offering if they’d like to make their pages available with free translations. A simple copy/paste action is all it takes to add our web page translator to your site.
In further acknowledgement of the limitations of machine translation, we also offer a direct link to an affordable professional on-demand translation team, delivering human translations sometimes in a matter of hours.
And we are not done yet … please stay tuned for new releases of more and better features, experiences, and integration scenarios. And please do continue to post your wishes to our blog. We read all your posts carefully and will factor in your feedback in the planning of our next design steps…
Andrea Jessee, MSR-MT User Experience PM
The Translator team is excited to announce the availability of the English to Russian language pair on MicrosoftTranslator.com. This language pair is now available across all implementations of the translator technology, including Live Search and (in the next few days) the Windows Live Messenger TBot.
You have probably noticed that the Russian to English language pair has been available for some time on our site. As always, translation quality is a top focus for our team. Sometimes reaching quality takes longer for a particular direction – this can be for many reasons. For example, if you are translating between a simple language and a complex language, the translations will be better going to the simple language than they will be going to the complex language. If you are interested in learning more about the technology behind our machine translation engine, see Will Lewis’ blog post on statistical machine translation.
While machine translation is certainly never perfect, for this new language pair we have now hit our quality bar for release. How do we determine the quality bar? In general, when the translation can be considered “useful”. We consistently receive feedback from our users that imperfect translation which is useful is better than no translation. So we have to balance user demand with translation quality. With that in mind, we test our language pairs with human evaluations, until we have reached “useful” translation.
We are always open to your constructive feedback and help – please continue to help us so that we can keep improving quality! We are always very grateful for good feedback.
Some other updates in this release you may notice:
· We have officially migrated our domain to www.microsofttranslator.com
· Improved quality across several language pairs, due to improvements in training data quality
· Improvements in Japanese to English, due to an improved method of parsing the training data