Sorting it all Out Michael Kaplan's random stuff of dubious value Be sure to read the disclaimer here first!
A couple of days ago when I wrote about how In Vista, jackdaws appear to be somewhat endangered, I mentioned
...both strings are actually in Message Compiler resources, which means they could actually be localized (though note that the above algorithm means that localization might make the situation worse here, not better). On top of that, what do you do when you have a font with no Latin characters in it? By this algorithm, they will just get another Latin script string, which will still have to use font linking to find the glyphs to display.
With the help of Claus Juhl (you may recall him from the Channel 9 video I posted about last August), I was able to look to see what the localizers for the various Windows language releases did to both strings:
Here are some of the highlights....
First of all, Jackdaws love my big sphinx of quartz was not localized for any language. You can contemplate what this means for the algorithm I posted. :-)
Second of all, this is clearly a problem like the one from 'Cette phrase en français est difficile à traduire en anglais', since it is clearly not intended that an actual translation of 'The quick brown fox jumps over the lazy dog' be done. What is desired is a pangram covering the letters in the target language.
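To make "a pangram covering the letters in the target language" concrete, here is a minimal Python sketch of a coverage check. The names `missing_letters` and `is_pangram` are made up for illustration, and the default alphabet is just the 26 basic Latin letters; for any other language the caller would have to supply that language's own letter list, which is an assumption this sketch does not try to settle.

```python
import string
import unicodedata


def missing_letters(text: str, alphabet: str) -> set[str]:
    """Return the letters of `alphabet` that never occur in `text`."""
    # NFC makes precomposed and decomposed spellings compare equal.
    # lower() is used instead of casefold() so that ß is not turned into "ss".
    seen = set(unicodedata.normalize("NFC", text).lower())
    wanted = set(unicodedata.normalize("NFC", alphabet).lower())
    return wanted - seen


def is_pangram(text: str, alphabet: str = string.ascii_lowercase) -> bool:
    """True when every letter of `alphabet` appears at least once in `text`."""
    return not missing_letters(text, alphabet)


if __name__ == "__main__":
    samples = (
        "The quick brown fox jumps over the lazy dog",
        "Jackdaws love my big sphinx of quartz",
        "Tämä on malliteksti.",
    )
    for sample in samples:
        print(is_pangram(sample), sample)
```

Both English strings pass the check and the Finnish sample text does not, which is the distinction the list below keeps running into.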
Let's see how it worked out with a bunch of those languages:
Arabic: من طلب العلا سهر الليالي.
Bulgarian: Вкъщи не яж сьомга с фиде без ракийка и хапка люта чушчица!
Chinese (PRC): The quick brown fox jumps over the lazy dog1.
Chinese (Taiwan): 微風迎客,軟語伴茶
Czech: Příliš žluťoučký kůň úpěl ďábelské ódy!
Danish: Quizdeltagerne spiste jordbær med fløde, mens cirkusklovnen Walther spillede på xylofon.
Dutch: Pa's wijze lynx bezag vroom het fikse aquaduct.
Finnish: Tämä on malliteksti.
French: Voix ambiguë d'un cœur qui au zéphyr préfère les jattes de kiwis.
German: Franz jagt im komplett verwahrlosten Taxi quer durch Bayern.
Greek: Θέλει αρετή και τόλμη η ελευθερία (Ανδρέας Κάλβος).
Hebrew: דג סקרן שט לו בים זך אך לפתע פגש חבורה נחמדה שצצה כך.
Hindi: सारे जहाँ से अच्छा हिंदोस्तां हमारा.
Hungarian: Árvíztűrő tükörfúrógép ÁRVÍZTŰRŐ TÜKÖRFÚRÓGÉP
Italian: Cantami o Diva del pelide Achille l'ira funesta.
Japanese: Windows でコンピュータの世界が広がります。
Korean: 다람쥐 헌 쳇바퀴에 타고파.
Norwegian: En god stil må først og fremst være klar. Den må være passende. Aristoteles.
Polish: Zażółć gęślą jaźń.
Portuguese (Brazilian): abcdefghijklmnopqrstuvwxyz.
Portuguese (Iberian): A rápida raposa castanha salta em cima do cão lento.
Russian: Съешь еще этих мягких французских булок, да выпей чаю.
Slovak: Kŕdeľ ďatľov učí koňa žrať kôru.
Slovenian: V kožuščku hudobnega fanta stopiclja mizar in kliče
Spanish: El veloz murciélago hindú comía feliz cardillo y kiwi. La cigüeña tocaba el saxofón detrás del palenque de paja.
Swedish: Flygande bäckasiner söka hwila på mjuka tuvor.
Turkish: abcçdefgğhıijklmnoöpqrsştuüvwxyz.
My favorites are Japanese and Iberian Portuguese.
How about yours? :-)
You may be wondering why I tagged this post with 'Unicode Lame List' -- just keep in mind how poor all of these sentences are at dealing with the issue of showing what makes a font unique to any user who might be curious. Just remember, it is not really the localizers who are lame here -- it is the implementation....
1 - When in doubt, don't translate? :-)
This post brought to you by ဆ (U+1006, a.k.a. MYANMAR LETTER CHA)
Judging by those sentences, I guess I can claim to understand... Chinese (PRC), Portuguese (Brazilian), and Turkish.
What is it about the Japanese that you like? It just says that the computer world widens with Windows. Of course that has nothing to do with the original two sentences, like most of the other "translations".
"Pack my box with five dozen liquor jugs"
Chinese (Taiwan): 微風迎客,軟語伴茶
Greek: Θέλει αρετή και τόλμη η ελευθερία (Ανδρέας Κάλβος).
Spanish: El veloz murciélago hindú comía feliz cardillo y kiwi. La cigüeña tocaba el saxofón detrás del palenque de paja.
and others
The Japanese one just seemed kind of silly to me, meeting very little of the requirements here.
But many of the translations are actually good pangrams for the given language, which makes them good translations.
The Iberian Portuguese one is the silliest since it is a literal translation of the original source sentence.
(And of course recognition of one piece of the UI of a localized version does not indicate language fluency!)
Our little Hungarian pangram ("árvíztűrő tükörfúrógép") means "flood-proof mirror-drill". :)
Its pros include that it's short (it's easy to type this into a new mobile phone when I want to test its Hungarian-specific capabilities).
The con is that it really isn't a sentence. That makes it less... typographigillishistically looking... If you know what I mean. :)
I thought the Chinese (Taiwan) sentence was rather clever, combining an elegant couplet with a play on the name of Microsoft.
微風迎客,軟語伴茶
A gentle breeze welcomes the guest,
Soft talk accompanies the tea.
If you take the first character of each line and put them together you get the Chinese for Microsoft : 微軟.
Also on the Iberian Portuguese - not only is it a literal translation but it actually uses the fewest letters of all of them! Fewer than 20, if I'm counting correctly (and depending on whether you count the letters with diacritics).
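That count is easy to check mechanically. A small Python sketch, assuming the sentence exactly as quoted in the post; `distinct_letters` is a hypothetical helper name, and "folding" here simply means decomposing to NFD and dropping the combining marks:

```python
import unicodedata


def distinct_letters(text: str, fold_diacritics: bool = False) -> set[str]:
    """Collect the distinct letters in `text`, ignoring case.

    With fold_diacritics=True the text is decomposed (NFD) so that accents
    become separate combining marks, which isalpha() then filters out.
    """
    form = "NFD" if fold_diacritics else "NFC"
    normalized = unicodedata.normalize(form, text).lower()
    return {ch for ch in normalized if ch.isalpha()}


sentence = "A rápida raposa castanha salta em cima do cão lento."

print(len(distinct_letters(sentence)))                        # 16
print(len(distinct_letters(sentence, fold_diacritics=True)))  # 14
```

So it is 16 letters if á and ã count separately from a, and 14 if they do not; fewer than 20 either way.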
The result obviously depends on whether the translator recognised that the original phrase is an English pangram and translated the intent, rather than the result. The Finnish version looks suspiciously short to be a literal translation (not that I know any Finnish) but doesn't really serve the purpose either.
I've started to translate my native one (Russian):
and found this: http://en.wikipedia.org/wiki/The_quick_brown_fox_jumps_over_the_lazy_dog
PS and even more here: http://en.wikipedia.org/wiki/Pangram
=)
Oh, I forgot to note another little, small, minor con with the Hungarian pangram: it's not actually a pangram.
So it doesn't display all the Hungarian characters, it only pays attention to the diacritic letters of Hungarian, which might be more problematic, especially őŐűŰ.
The sentence shown for Hindi does not use all the possible letters and their conjuncts. The display forms for the conjuncts for Indic are different from just putting side by side the letters from which they have been derived. That is the domain of OpenType fonts and complex scripts. I don't have to explain these to you. Hence if the purpose is to show all the possible display forms for Indic, then the total number of such combinations may be anywhere near 15,000 per script.
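A rough Python sketch of that split, for the curious. The two letter lists are my own assumption for illustration: the nine accented Hungarian vowels, and the five plain vowels as a stand-in for "the rest of the alphabet" (the Hungarian digraphs such as cs or sz are ignored here). `uncovered` is a made-up helper name.

```python
import unicodedata

ACCENTED_VOWELS = "áéíóöőúüű"  # assumed list of accented Hungarian vowels
PLAIN_VOWELS = "aeiou"         # stand-in sample of the unaccented letters


def uncovered(text: str, required: str) -> list[str]:
    """Return, sorted, the letters of `required` that do not occur in `text`."""
    seen = set(unicodedata.normalize("NFC", text).lower())
    wanted = set(unicodedata.normalize("NFC", required).lower())
    return sorted(wanted - seen)


phrase = "Árvíztűrő tükörfúrógép"

print(uncovered(phrase, ACCENTED_VOWELS))  # []
print(uncovered(phrase, PLAIN_VOWELS))     # ['a', 'e', 'i', 'o', 'u']
```

Every accented vowel is there, and not a single plain one, so the phrase is a good diacritics test and not a pangram.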
-Pavanaja
Well, I am not suggesting that all forms need to be displayed, any more than I am suggesting that Chinese needs 70,000+ ideographs displayed!
The idea would be a sentence that has every letter used....
The German version doesn't show any of the special characters (ä,ö,ü,ß) either. But at least it shows all "basic" letters from the alphabet.
Knowing what the Hungarian pangrams mean - I really like that
Michael,
Even with that +, Chinese needs well more than 70,000. There are still a significant number of historical characters that I cannot type. Hopefully Extension D will begin to solve that.
The Swedish one is a well-known typographic test sentence, our version of “The quick brown fox…”, so it is a good localization.
The Finnish version, if I read it correctly, just says “this is a sample text”.