Sorting it all Out Michael Kaplan's random stuff of dubious value Be sure to read the disclaimer here first!
Ok, it is time for one of my periodic delusional episodes (you know, those delusions of linguistic aptitude I have from time to time).
(this post pre-recorded, a little blog experimentation!)
Now there is disambiguation, a word which may have already existed (it did according to a colleague who is in fact a linguist). But it was spontaneously reinvented by a program manager presenting at a developer conference, trying to describe the process by which identifiers in VBA are resolved. There had been a party that went late into the night on the evening prior, and he was tired. Maybe even a little hungover. He was explaining that if the name has not been bound to anything yet, what it is meant to refer to is ambiguous. This set the stage for his next words -- that VBA had to look through the references to disambiguate the name.
He thus introduced the word into the cosmic consciousness of VBA developers.
But I was going to talk about identifiers.
When I had to write the managed version of IsNLSDefinedString for Whidbey, what to call it was an interesting question. I suspect that largely on the basis of no one else really caring what it was called, no one objected when I dubbed it IsSortable. I actually had more than one person ask me later if sortable is a real word (to which I of course responded that English is a productive language, yada, yada, yada). Following the longstanding precedent of the Server 2003 IsNLSDefinedString function, unpaired surrogates and private use characters, in addition to weightless strings, will cause both of them to return FALSE. And while several people have asked why (since both do have some kind of sorting weight), people have stuck to their guns on this one -- there is no useful cross-machine usage for either unpaired surrogates or private use area characters in identifiers like machine names.
It may seem somewhat pretentious, it may even be a little pretentious. But using these characters that way is just not a good idea, and maybe by having a method that calls itself IsSortable, people can be influenced against using these things in machine names and identifiers in programming languages and such. The former might be possible (Active Directory uses us for collation after all), but the latter is of course a pipe dream, since programming languages that allow atrocities like this will not even blink before allowing these "unsortable" characters in identifiers.
But is there something wrong with using IsSortable here? It is not like the naysayers who questioned its validity as a word had a better name they could suggest. And the method is referring to strings being used in collation operations, which do prefer meaningful strings anyway. Maybe IsDefined would have been better, but people seemed reluctant to have too many new concepts added. If people were to ask what the consequences of being undefined were, the answer would be that you could not sort them effectively -- so we'd have to explain what it meant to be sortable, anyway. So the current plan has fewer concepts to explain. :-)
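For anyone who wants to see the shape of that rule, here is a rough sketch in Python. To be clear about the assumptions: `is_sortable` is a hypothetical name for illustration, not the managed method or the Server 2003 function, and it knows nothing about actual sorting weights. It just uses Unicode general categories, rejecting surrogates (which in a Python string are unpaired by construction), private use characters, and unassigned code points as a loose stand-in for "weightless".

```python
import unicodedata

# Hypothetical illustration only; not the .NET or Win32 implementation.
# General categories this sketch treats as "not sortable":
#   Cs = surrogate, Co = private use, Cn = unassigned
_UNSORTABLE_CATEGORIES = {"Cs", "Co", "Cn"}


def is_sortable(text: str) -> bool:
    """Return False if any code point is a surrogate, private use, or unassigned."""
    return all(
        unicodedata.category(ch) not in _UNSORTABLE_CATEGORIES
        for ch in text
    )


if __name__ == "__main__":
    print(is_sortable("machine01"))      # ordinary letters and digits
    print(is_sortable("name\ud800"))     # lone high surrogate
    print(is_sortable("name\ue000"))     # private use area character
```

An empty string passes trivially in this sketch, which is a choice of the sketch and nothing more.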
Now yesterday, I was talking about the TextRenderer class. If one has the job of rendering, I suppose one is a renderer. And if one is rendering text, one is a text renderer. And English is a productive language, yada, yada, yada. But is it a word one would usually use in this context? It seems like it is more common to put the word Render in a method. And obviously when one looks at the methods on the class, it has two actions -- measuring and drawing. It is kind of a stretch to say they both fall under the category of the act of rendering. So what makes this usage seem okay? Or maybe nothing does and people just shake their heads.
Which gets me back to what you could (loosely) call the subject of this post.
The grammar of identifiers is a sparse one, meant to be the consistent application of a limited number of concepts. If I were creating a property on System.String for this, it would just be String.Sortable or String.Defined. But any time they have to be methods (like when they take parameters), they have an Is prefix, like all the char.Is* methods. Maybe calling a class TextRenderer feels weird to some because classes are supposed to be "pure" nouns, and not just the noun forms of verbs. Or maybe it just feels weird since the scope of what the class represents is not fully covered by the name. All as if we are trying to create an actual language that one could use to communicate the concept of the program.
Of course to non-programmers it may appear that programmers are talking like people with developmental disabilities, and some linguists may even balk at the idea of calling a programming language an actual language, but in truth one can communicate some very complex ideas. And every time I write a program in an object oriented language, I am extending the language. Well, maybe I am not unless I am adding something to the BCL, which in my case actually can happen.
But that raises another interesting idea -- when I create methods and properties in a new program, am I creating a dialect? One that you only speak if the appropriate terms are in scope for you? And if that is true, what does it mean to be the author of a much-used library -- are you the programming equivalent of the Académie française?
Of course, as Raymond Chen pointed out, sample code often tries harder to be multilingual, for obvious reasons. :-)
Are we creating language here? Or are my delusions of linguistic aptitude confusing me?
I wonder if anyone has ever studied this before, on the linguistic side.
This post brought to you by "Ӓ" (U+04D2, CYRILLIC CAPITAL LETTER A WITH DIAERESIS)