To date, I haven't spent very much time in the MSDN Forums.

Not that they are not cool or anything like that. I think that the work that Josh Ledgard and others have put into the setup, and that so many community members have put into answering questions, has been very cool.

It is just that there are only so many hours in a day, and between my actual work and the limited time in which I allow myself to have an actual life, reading blogs and posting to this one takes up whatever time is left over. A whole new place to be looking is just a bit more than I can handle in my spare time.

But I think it is an awesome resource because of its differences. The truth is that some people will be comfortable with newsgroups, others with blogs, still others with PSS phone calls, and yes -- some with the MSDN Forums. And so on. Since the goal is to get the questions answered so that people are helped, offering several different ways of providing assistance helps make sure more people can get what they need.

Which is not to say I never make it to the MSDN Forums. Because from time to time, Stephen Fisher will notice an 'international' question that has not yet been answered and he'll have me take a look.... :-)

Anyway, a few weeks ago, Carl M. asked the following question here:

How can I get the description (name) of a char in English? Assume the string it comes from is normalized.

public static string GetDescription(char c){
   
//? how to return the description
}

For example GetDescription('ñ') should return "Latin small letter n with a tilde"

What about composite characters like most of the Hebrew letters?

Thanks in advance

Carl

Unfortunately, there is nothing in the .NET Framework that will return the Unicode character names. These are not produced algorithmically but are instead assigned when characters are added to Unicode. And although they usually seem to follow nice, neat rules, there are plenty of them that are not intuitive or understandable.

Occasionally there is a bug where the name is not even correct! But the rules are clearly laid out in item #2 of the Stability Policy for the Unicode Standard:

2. Name Stability

Applicable Version: Unicode 2.0+

Once a character is encoded, its character name will not be changed.

The character names are used to distinguish between characters, and do not always express the full meaning of each character. They are designed to be used programmatically, and thus must be stable.

In some cases the original name chosen to represent the character is inaccurate in one way or another. Any such inaccuracies are dealt with by adding annotations to the character name list (which is printed in the Unicode Standard and provided in a machine-readable format), or by adding descriptive text to the standard.

Note: It is possible to produce translated names for the characters, to make the information conveyed by the name accessible to non-English speakers.

So for every character there is one and only one official name, and to get that name you have to be using something that is storing the actual name.

Which the .NET Framework does not have. It is not even very common functionality in Microsoft products (well, MSKLC has the names and so does Character Map in Windows; both of them use a slightly friendlier proper-cased name rather than the official ALL CAPS one). But there is no public function in the Win32 or .NET APIs to provide the info.

Though of course it is always something that can be considered for a future version if the scenarios are compelling enough -- so if you have a requirement then feel free to explain your scenario here. :-)
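In the meantime, "using something that is storing the actual name" can mean carrying the data yourself. The machine-readable format mentioned in the stability policy is the Unicode Character Database, whose UnicodeData.txt file is a semicolon-separated list with the code point (in hex) in the first field and the official name in the second. What follows is only a minimal sketch of that idea, not anything that ships in the Framework: the `UnicodeNames` class, its `Load` method, and the extra dictionary parameter on `GetDescription` are all made up for illustration, and the two sample lines are there so it runs without the file.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper (not a .NET API): looks up official Unicode character
// names from lines in the UnicodeData.txt format, "code point;name;...".
public static class UnicodeNames
{
    public static Dictionary<int, string> Load(IEnumerable<string> lines)
    {
        var names = new Dictionary<int, string>();
        foreach (string line in lines)
        {
            string[] fields = line.Split(';');
            if (fields.Length < 2)
                continue; // blank or malformed line

            string name = fields[1];
            // Entries such as <control> and the <..., First>/<..., Last>
            // range markers are placeholders, not real character names.
            if (name.StartsWith("<", StringComparison.Ordinal))
                continue;

            int codePoint = int.Parse(fields[0], NumberStyles.HexNumber,
                                      CultureInfo.InvariantCulture);
            names[codePoint] = name;
        }
        return names;
    }

    // Returns the official name, or null if the table has no name for c.
    public static string GetDescription(Dictionary<int, string> names, char c)
    {
        string name;
        return names.TryGetValue((int)c, out name) ? name : null;
    }

    public static void Main()
    {
        // Two sample lines in the UnicodeData.txt format.
        string[] sample =
        {
            "0000;<control>;Cc;0;BN;;;;;N;NULL;;;;",
            "00F1;LATIN SMALL LETTER N WITH TILDE;Ll;0;L;006E 0303;;;;N;LATIN SMALL LETTER N TILDE;;00D1;;00D1"
        };
        Dictionary<int, string> names = Load(sample);

        // Prints: LATIN SMALL LETTER N WITH TILDE
        Console.WriteLine(GetDescription(names, '\u00F1'));
        // Prints: (no name)
        Console.WriteLine(GetDescription(names, '\u0000') ?? "(no name)");
    }
}
```

With a real copy of the file you would pass its lines to `Load` instead of the sample array (where the file lives is up to you). Two caveats: a `char` only covers the BMP, so supplementary characters would need the full code point rather than a single `char`, and names are assigned per code point, so a decomposed sequence like a Hebrew letter plus its points gives you one name per code point rather than a single combined name.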

On the Unicode side, there are plans afoot to provide mechanisms to fix really awful mistakes without violating those stability guarantees, something that I will talk more about when it becomes a reality.

This post brought to you by "ᢆ" (U+1886, a.k.a. MONGOLIAN LETTER ALI GALI THREE BALUDA)