(Prof Jean Véronis, Aix-en-Provence)
The launch of the French version of OpenOffice a few days ago prompted me to expand my comparison of spell-checkers. In a previous study [here], I compared the Microsoft Word speller with the spell-checking function in the Google Toolbar. MS Word clearly came out ahead (with the patch, i.e. the new version, which it is important to download), mainly thanks to its good treatment of proper names and, to a lesser extent, its grammar checking (in particular for agreement problems). As we will see, the match with OpenOffice is tighter.
I kept the same text for the evaluation (an article from the newspaper Le Monde which I submitted to my "spell-wrecker": here). Here are the results (noise refers to false flags, and silence refers to missed flags, i.e. mistakes that were not spotted):
MSWord (with Patch)
Without proper names or foreign words
With proper names and foreign words
If we leave aside the proper names and foreign words used in the text, OpenOffice raises slightly fewer false flags than MSWord, but misses more mistakes. The tendency is the same when proper names and English words are taken into account: a little less noise and more silence. It is important to check the settings and make sure that OpenOffice uses the option "detect all languages": language detection seems to work fine, at least on my example. The sentence "Do you like roast-beef?", cited in English in the text, is identified as an English sentence by OpenOffice, but not by MSWord. (It should be added that language detection on such small fragments is a very delicate process.)
As we can see, the results are very close. On the whole, MSWord has a slight advantage over OpenOffice (Google is far behind). I don’t want to be too technical here, but the difference can be measured more precisely if one penalizes noise and silence equally, using the F-measure (the harmonic mean of precision and recall): this F-measure is slightly higher for MSWord in both cases (87.4% vs. 85.4% in the first case; 85.0% vs. 81.8% in the second case).
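For readers who want to reproduce this kind of score, here is a minimal sketch of how an F-measure can be derived from the three quantities used above: correct flags, noise (false flags) and silence (missed mistakes). The function name `f_measure` and the counts in the usage lines are hypothetical and chosen only for illustration; they are not the figures from my evaluation.

```python
def f_measure(correct_flags, false_flags, missed_mistakes):
    """Return (precision, recall, F-measure) for a spell-checker run.

    correct_flags   -- real mistakes that the checker flagged
    false_flags     -- correct words flagged by error ("noise")
    missed_mistakes -- real mistakes left unflagged ("silence")
    """
    flagged = correct_flags + false_flags        # everything the checker flagged
    actual = correct_flags + missed_mistakes     # every real mistake in the text

    # Precision drops as noise grows; recall drops as silence grows.
    precision = correct_flags / flagged if flagged else 0.0
    recall = correct_flags / actual if actual else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0

    # Balanced F-measure: harmonic mean, so noise and silence weigh equally.
    f = 2 * precision * recall / (precision + recall)
    return precision, recall, f


# Hypothetical counts, for illustration only.
p, r, f = f_measure(correct_flags=90, false_flags=10, missed_mistakes=30)
print(f"precision={p:.1%} recall={r:.1%} F={f:.1%}")
```

Because the harmonic mean is pulled toward the lower of its two inputs, a checker that is very quiet but misses many mistakes (or the reverse) scores lower than one that balances the two.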
The performance of the OpenOffice speller is quite respectable given its open nature and the more limited means available for its development. Of course, more in-depth analyses should be carried out on a larger scale and with other types of texts; my experiment has only an indicative value. However, I think that the developers of the French version of OpenOffice should be vigilant. Microsoft has obviously resumed the development of its French proofing tools, with a very competent team. On some fronts, Microsoft's conceptual lead is very strong, even if these figures do not yet show it very much. Grammar checking is a case in point, a field in which, as I pointed out the other day here, Microsoft is now noticeably improving things. Let us also point out that OpenOffice does not yet integrate the spelling reform which (since 1990...) has been recommended by the Conseil Supérieur de la Langue Française and the Académie Française, for words such as règlementaire, révolver, ambigüe, etc. (but it would be fairly easy to integrate it).