Prof. Jean Véronis, who teaches computational linguistics at the University of Aix-en-Provence in France, has a very interesting blog on natural language technologies. A few days ago, he posted a well-balanced evaluation of our new French Office 2003 spell-checker, which he compared to the one in the Google Toolbar. His post is in French, so I thought a brief summary in English would be useful.

 

His analysis is very interesting. He basically agrees with some of the arguments I listed in a comment on one of his earlier blog entries. In his evaluation, he shows that proper names, for instance, should be taken into account. Users want their texts spell-checked and corrected whether or not they contain proper names: if you write Londen, you don’t really care whether it’s a proper name; you just expect your speller to tell you you’ve made a mistake and to suggest London instead, if that’s what you meant.

 

Prof. Véronis took proper names into account when calculating the precision and recall of the new Microsoft speller, of the preceding version and of the Google Toolbar. As one can see in the table below, he concludes that the precision of the new Microsoft Word speller is much better than that of the Google Toolbar. This can be explained fairly easily: the Google speller apparently has a very poor lexicon of geographical and personal names. Londres, Madrid, New York, Moscou, Singapour and Chirac are all treated as mistakes by the Google Toolbar, exactly like real spelling mistakes such as Londre and Chriac. The MS Word speller only flags Londre and Chriac (and rightly so) and does not needlessly draw the user’s attention to words that are not real mistakes (such flags are known as “false flags”). The Word speller also suggests the appropriate correction when there is a real mistake: if you type Londre, the French speller will tell you it should be Londres.
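To make the effect of the lexicon concrete, here is a minimal sketch in Python of a lexicon-based checker. It is only an illustration: the tiny word lists, the `check` function and the similarity cutoff are assumptions made up for this example, and the suggestions come from the standard library's `difflib`, not from anything resembling the actual Word or Google spellers.

```python
from difflib import get_close_matches

# Hypothetical toy lexicons; real spellers rely on far larger word lists.
COMMON_WORDS = {"je", "vais", "à", "demain", "et", "puis"}
PROPER_NAMES = {"Londres", "Madrid", "Moscou", "Singapour", "Chirac"}


def check(words, lexicon):
    """Return {flagged_word: suggestions} for every word absent from the lexicon."""
    candidates = sorted(lexicon)
    flags = {}
    for word in words:
        if word not in lexicon:
            # Suggest lexicon entries that are very similar to the flagged word.
            flags[word] = get_close_matches(word, candidates, n=3, cutoff=0.8)
    return flags


text = ["je", "vais", "à", "Londres", "et", "puis", "à", "Londre"]

# Without proper names, the correct word "Londres" is flagged too (a false flag).
poor = check(text, COMMON_WORDS)

# With proper names, only the real mistake "Londre" is flagged, with a suggestion.
rich = check(text, COMMON_WORDS | PROPER_NAMES)

print(poor)
print(rich)
```

With the poorer lexicon, both Londres and Londre are flagged and no useful suggestion is offered; with the richer one, only Londre is flagged and Londres is proposed as a correction.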

 

Here are Prof. Véronis’ evaluation metrics, taken from his blog. Noise corresponds to the percentage of false flags, i.e. words which are correct but which a speller flags as incorrect because they are not in its lexicon. Silence is the converse: the percentage of real mistakes that a speller fails to flag.

 

Speller                 Noise (%)   Silence (%)
Old MS Word speller     21.7        23.5
New MS Word speller     9.3         20.0
Google                  34.7        22.4

 

As one can see, our speller is much more useful than the Google Toolbar because we have a much richer lexicon. More than a third of the “mistakes” found by the Google Toolbar (34.7%) are in fact false flags, while fewer than 10% of the words flagged by our new French speller are (9.3%). Of course, no lexicon on earth can contain all the possible words of a given language (new words are created every day), but some tools are clearly better than others…
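For readers who want to see how such percentages are obtained, here is a small sketch in Python. It assumes that noise is the share of flagged words that are actually correct, and that silence is the share of real mistakes left unflagged (the usual complements of precision and recall). The function names and the sample words are invented for illustration; they are not Prof. Véronis’ test data.

```python
def noise(flagged, real_errors):
    """Percentage of flagged words that are not real mistakes (false flags)."""
    false_flags = flagged - real_errors
    return 100.0 * len(false_flags) / len(flagged)


def silence(flagged, real_errors):
    """Percentage of real mistakes that the speller failed to flag."""
    missed = real_errors - flagged
    return 100.0 * len(missed) / len(real_errors)


# Hypothetical example: words a speller flagged vs. words that are truly misspelled.
flagged = {"Londre", "Chriac", "Madrid", "Moscou"}
real_errors = {"Londre", "Chriac", "Pari"}

noise_pct = noise(flagged, real_errors)      # Madrid and Moscou are false flags
silence_pct = silence(flagged, real_errors)  # Pari was missed

# Under these definitions, precision and recall are the complements.
precision_pct = 100.0 - noise_pct
recall_pct = 100.0 - silence_pct

print(f"noise={noise_pct:.1f}% silence={silence_pct:.1f}%")
```

Under these definitions, a noise of 34.7% means that roughly one flag in three points at a word that was spelled correctly.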

 

Readers might wonder why there is such a discrepancy. I think the reason is fairly simple: we have worked a lot to develop, improve and test our lexicons and our spellers so that they reflect the evolution of the language and capture its most frequent words, including important proper names. This makes the tool very useful for our end-users. I should also add that working with our Encarta colleagues has enabled us to use their databases of geographical names and famous people’s names to enrich our dictionary and hence reduce the noise. I guess the quality label I was telling you about the other day is evidence that we made the right decisions…

 

I was also very pleased to read that Jean Véronis had been impressed by the new versions of our other proofing tools, such as our new French grammar checker. Language is always a very sensitive issue and people are usually very quick to criticize proofing tools, so getting such comments on the quality of our linguistic tools from such a well-known and respected expert is always pleasant... :)

 

Thierry Fontenelle [MSFT]

 

Microsoft Speech & Natural Language