Sorting it all Out Michael Kaplan's random stuff of dubious value Be sure to read the disclaimer here first!
A few months ago, I was talking with a customer who simply could not understand the sorting results she was seeing (in this case in a table in MS Word 2003). She distilled it down to a small repro; basically she took a small list of words:
(at this point I knew both the language and what was causing her to have problems, and you may know too!)
What she noticed was that if she marked the left column as being French text (she tried several French choices, including France and Canada), the order was like this:
while if the column was marked as being English then the results looked like this:
She could not understand any sorting rules that would explain the way that the words were sorting in the "French" table.
So I talked with her about the Académie française and how they have a specific preference related to the way letters with diacritics are to be sorted (I also mentioned incidentally that I thought they had abolished the use of the circumflex in many words in the early 1990s, but honestly I did not know if it applied to these two words that use the SMALL O WITH CIRCUMFLEX!).
The specific rule I was talking about here is that diacritics are evaluated from right to left rather than from left to right. Thus côte comes before coté, because the word côte has no ACUTE on the "e" at the end of the word while coté does. In English and most other languages, the evaluation starts on the left, so the CIRCUMFLEX or lack thereof on the "o" is the controlling factor and côte comes after coté.
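To make the rule a bit more concrete, here is a minimal Python sketch of a two-level sort key. This is only an illustration, not how Word or Windows actually implements it: the `sort_key` function and its `backward_accents` parameter are names I made up for the example, and ranking the accents by the code point of their combining mark is a simplifying assumption rather than the real collation weights.

```python
import unicodedata

def sort_key(word, backward_accents=False):
    """Hypothetical two-level key: base letters first, then diacritics.

    Assumption: accents are ranked by the code point of their combining
    mark, which is a simplification and not a real collation table.
    """
    # Decompose so that "ô" becomes "o" followed by COMBINING CIRCUMFLEX.
    decomposed = unicodedata.normalize("NFD", word.casefold())
    bases = []   # the letters with their accents stripped
    marks = []   # one tuple of accent code points per letter
    for ch in decomposed:
        if unicodedata.combining(ch):
            if marks:  # ignore a stray mark that has no base letter
                marks[-1] += (ord(ch),)
        else:
            bases.append(ch)
            marks.append(())
    if backward_accents:
        # The French rule: compare accents starting from the end of the word.
        marks.reverse()
    return ("".join(bases), tuple(marks))

words = ["côté", "coté", "cote", "côte"]

# Accents compared left to right, as in English.
print(sorted(words, key=sort_key))
# ['cote', 'coté', 'côte', 'côté']

# Accents compared right to left, as in French.
print(sorted(words, key=lambda w: sort_key(w, backward_accents=True)))
# ['cote', 'côte', 'coté', 'côté']
```

The base letters are identical for all four words, so the accents alone decide the order, and reversing the direction in which they are compared is enough to swap côte and coté.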
You can see it described in the French sort order example in Appendix D of the first edition of Developing International Software for Windows 95 and Windows NT.
This particular rule is interesting in that among all of the native French speakers with whom I have spoken, I have never found anyone who could explain the rule to me. In their defense they were pretty much all aware that there were special rules used in dictionaries, but if you think about it there would seldom be a time that one could not find the word one wanted in a dictionary that used this rule. I am sure that after living with the language for a lifetime, speakers simply understand things like this subconsciously when they occur. This phenomenon is common in almost all languages, and they pretty much all have rules that native speakers understand even if the speakers cannot articulate the rules.
Another interesting factoid about language that can be seen here has to do with the fact that this use of "reverse diacritics" is seen in every French locale supported by Windows. It is fascinating to see the influence that the "mother country" of a language can sometimes have on changes that are made to other places where it is spoken.
When changes are made, whether by longtime organizations such as the Académie or by direct legislation, other countries will in many cases tend to pick up those changes. To me, the reasons behind such language reforms spreading this way are fascinating to contemplate. It is certainly not any kind of sovereignty or true language "ownership" issue (and in future posts I may discuss specific cases in other languages where changes were at times intentionally not picked up!).
But I am at times amazed at the way that people will appear to see language as transcending the petty things. It's the kind of behavior that makes me interested in linguistic issues. :-)
This post brought to you by
I was having a conversation with someone at the 30th Internationalization and Unicode Conference earlier
A classic question: which code page, sort order, and comparison (collation) setting should be defined in
# re: French collation: When diacritical becomes diabolical
Thanks so much for this information, all new to me, including the elimination of the accent circumflex.
I agree with what David said above: except for dictionaries and the most demanding academic needs, I would ignore accents when sorting words with diacritics.
BTW, does anyone know a way to sort an HTML list in alphabetical order when the lines are hyperlinked? -- Mod Mekkawi, Howard U, Washington, DC
Yes, these changes are awful for linguists and translators, we really have to battle to keep updated... www.alafrench.com for French translation services by the way.