Collation != case, still

Sorting it all Out
Michael Kaplan's random stuff of dubious value
Be sure to read the disclaimer here first!

Richard asked in the Suggestion Box, and I decided to dispatch quickly:

Why is it that English (en-US, because there is no en-GB) Windows and .NET don't know how to uppercase a Latin Small Letter Sharp S, even with the de-DE locale specified:

"\u00DF".ToUpper(CultureInfo.GetCultureInfo("de-DE"))

does not return "SS", but "ß"?

The Unicode casing file CaseFolding.txt has

00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S

Is this a Windows limitation? (Which would not help, given I'm trying to put together a demo of doing the right thing to build I18n into an application update.)

This is a question I have talked about many times in the past, as a simple search for U+00df indicates. And most importantly, since Casing and IgnoreCase are still not the same thing and Collation != Case (a.k.a. Collation <> Case), for now this is how casing will work on Microsoft platforms -- what Unicode refers to as simple casing, where each character maps to exactly one other character. "ß" has no single-character uppercase form in those mappings, so it comes back unchanged....
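
To make the simple-versus-full distinction concrete, here is a minimal C# sketch, not anything that ships in .NET. ToUpperFull is a hypothetical helper that layers the one unconditional SpecialCasing.txt mapping for U+00DF (to "SS") on top of the built-in ToUpper. A real full-casing implementation would need the rest of that file, including its context- and language-sensitive entries. The final comparison is only there to show that equality goes through collation rather than casing; its result may depend on the platform's collation data, so no expected value is given for it.

```csharp
using System;
using System.Globalization;

static class FullCasingSketch
{
    // Hypothetical helper (an assumption for illustration, not a .NET API):
    // run the built-in simple casing first, then patch in the single
    // full-casing entry this post is about, U+00DF -> "SS".
    public static string ToUpperFull(string text, CultureInfo culture)
    {
        string simple = text.ToUpper(culture);   // one-to-one mappings only
        return simple.Replace("\u00DF", "SS");   // ordinal search and replace
    }

    static void Main()
    {
        CultureInfo german = CultureInfo.GetCultureInfo("de-DE");
        string street = "Stra\u00DFe";

        // Built-in casing: per the post, the sharp s is left as U+00DF.
        Console.WriteLine(street.ToUpper(german));

        // Sketch of full casing for this one character: prints STRASSE.
        Console.WriteLine(ToUpperFull(street, german));

        // Collation is a separate mechanism from casing; print the result
        // rather than assume it (0 would mean "compares equal").
        Console.WriteLine(string.Compare(street, "STRASSE", german, CompareOptions.IgnoreCase));
    }
}
```

The patch-after-ToUpper shape keeps the culture-sensitive behavior of the built-in call and only adds the one mapping that simple casing cannot express.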

 

This post sponsored by "ß" (U+00df, LATIN SMALL LETTER SHARP S)

Blog - Comment List
  • I know there are "good reasons" for things working the way they do...but anybody who has had even a semester of high-school German knows that ß upper-cases to SS; that is, "Straße" (street) becomes "STRASSE" (although I seem to recall that perhaps the rules are different in Austria and/or Switzerland?)

    Since in .NET, ToUpper() returns a new string, it "should" be easier to "fix" this problem in that environment.
  • Well, "should be" is a relative term -- it is still using the same casing tables to do the work. We are more flexible in collation so we give the support....
  • Thanks...

    (Sharp S search failed to find anything... didn't try just the code point.)
  • Or, rather I should say,

    Search for "Sharp S" failed to find anything about case folding (quite a few hits around collation/equality.)
  • At the beginning of the week I posted Part 0 of this series, so I figured I should start the series at

  • There are several scripts that have the notion of case, like Latin, Cyrillic, Greek, Armenian, Coptic,

  • That night I saw in the pipeline fair A character that wasn't there Non-existence won't stop the encoding;

  • SQL Server likes to keep a bit of independence from the operating system. At the same time, they like

  • I am writing this blog from my own laptop waiting in the ER at the hospital (all of the quotes are from
