Why Is The Return Type Parameter Last?

The generic delegate type Func<A, R> is defined as delegate R Func<A, R>(A arg). That is, the argument type is to the left of return type in the declaration of the generic type parameters, but to the right of the return type when they are used. What’s up with that? Wouldn’t it be a lot more natural to define it as delegate R Func<R, A>(A arg), so that the R’s and A’s go together?

Maybe in C# it would, but in this case, it’s C# that’s the crazy one.

When we say it in English, the argument type comes before the return type. We say “length is a function from string to int”, not “length is a function to int from string”.
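To make the reading order concrete, here is a minimal sketch of the "length" example as a C# delegate. The variable names are illustrative only, and the snippet assumes a compiler that supports top-level statements (C# 9 or later).

```csharp
using System;

// Func<string, int> reads left to right as "function from string to int":
// the argument type comes first and the return type comes last.
Func<string, int> length = s => s.Length;

int n = length("hello");
Console.WriteLine(n); // prints 5
```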

When we write it in mathematical notation, we say that a function’s domain and range are defined as f:A→R – again, the “return type” comes last.

And in many languages, the return type of a function comes in the sensible position. In VB it’s Function F(arg As A) As R.  In JScript .NET it’s function F(arg : A) : R.

And finally, consider higher-order functions; say, a function from A to a function from B to C. You want to think of this as A→(B→C); do you really want to write that as Func<Func<C, B>, A> ? This is completely backwards. Surely you want A→(B→C) to be represented as Func<A, Func<B, C>>.
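As a rough illustration of A→(B→C) written as Func<A, Func<B, C>>, the sketch below picks A = int, B = int and C = string. The padTo function is a made-up example, not something from the framework, and the snippet assumes top-level statements (C# 9 or later).

```csharp
using System;

// A -> (B -> C) with A = int, B = int, C = string.
// Hypothetical example: given a width, return a function that
// formats a number left-padded with zeros to that width.
Func<int, Func<int, string>> padTo =
    width => value => value.ToString().PadLeft(width, '0');

// Supplying the A yields the inner B -> C function.
Func<int, string> padTo4 = padTo(4);

Console.WriteLine(padTo4(42)); // prints 0042
```

The type reads in the same order as the arrow notation: the outer argument, then the inner argument, then the final result.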

C# gets it wrong because C# inherits the basic pieces of its syntax from C, and C gets it wrong. Well, no, rather, it would be more fair to say that C is a non-typesafe, non-memory-managed language where it is vitally important that the code maintainer understands the lifetime and type of the data in every variable. Given that unfortunate situation, it makes sense to emphasize the storage mechanism first, and then the semantics second. Therefore in C you put the storage metadata first (static int customerCount;) rather than the semantics first (it could have been var customerCount: static int;). Once you’re in the position where the type comes first on variable declarations, it makes sense to apply the same rule to all other kinds of declarations – methods, formal parameters, and so on.

It might have been nicer back in the early days of C# to say “you know, we have a type-safe, memory-managed language, let’s do what VB does, de-emphasize the type mechanism and put the type as an annotation on the end”. We could then make that consistent throughout the language so that Func<A, R> referred to delegate Func<A, R>(arg : A ) : R. But that ship has sailed and we’re stuck with the declaration syntax we’ve got.

  • I doubt there is a "right" here. There is simply a distinction: arguments and return. I suspect (no research here) that K&R found the grammar easier to simplify (or terser: otherwise you may be compelled to use the "AS" or -> token) by putting the return type first. There is no performance, resource management or type-safety gain from putting that information first or last, as there is no semantic significance to choosing one sequence over another.

    That conventional "math" notation tends towards args first isn't semantically significant either, but your exposition is helpful for setting mental analogs. I liken this to driving on the left side vs right side: either is fine, but remember which country you're in and respect the rules of the road when you're there!

  • Bah!

    @Greg nobody designs a language for easy grep'ing, it's just a side-effect.

    @George +1

    I would disagree that either C or C# gets it wrong. It's just a convention, and it wouldn't be called C# if it didn't resemble C / C++. As @George says, there is no inherent type-safety or resource management gain.

    Mathematicians don't always agree on their symbols - it's not like Euclid set everything in stone eons ago and since then mathematical notation has remained unchanged.

    Things evolve. Decisions were made, and language designers attempt to add new features without breaking existing code. You know this well, and not all decisions are based on their pure mathematical counterparts. If mathematical purity was the top of the list, then Linq or other post-Unicode languages would use the Unicode mathematical symbols for operators Union (U+222A), Subset (U+2282), or Intersection (U+2229), etc - but to my knowledge nobody really cares that much.

    All sorts of programming constructs are done for a variety of reasons, and I would expect terse but expressive as well as ease of maintenance are never far from the minds of the language designer, not to mention the task of writing the compiler itself.

    For what it's worth, I find the Func<T,TResult> to be annoying and counterintuitive even if allegedly mathematically superior.

    When you provide overloads such as Func<T1,T2,TResult>, it seems more natural (from a C# background) to want to do Func<TResult,T> and Func<TResult, T1, T2>, based on the almost universal assignment

    x = Func(a);    // Func<x,a>, but C# does Func<a,x>

    x = Func(a,b); // Func<x,a,b>, but C# does Func<a,b,x>

  • @Greg

    >>> Return type to the left of the function name makes it possible to text search for all functions that return type ABC as well as see the return type of function DEF.

    For C#, maybe (I'm not 100% sure), but for C/C++, try to write a regex that would handle e.g. functions returning function pointers...

    In practice, for tasks such as one you describe, you really need proper tooling. E.g. IntelliJ IDEA has a "Java code search" feature with a pattern matching language that lets you match code constructs (rather than text).

    >>>  C, maybe pre-Posix, let you return and dereference a pointer on the left hand side of an assignment statement.

    Not sure what you mean here - you can dereference a pointer on left side of assignment in virtually every language that has pointers and assignment; including e.g. C#.

    @George

    >>> I suspect (no research here) that K&R found the grammar easier to simplify (or terser: otherwise you may be compelled to use the "AS" or -> token) by putting the return type first

    More likely, they just followed the type-first scheme for variable declarations, and those they have sort of inherited from B (not as types, but as storage modifiers).

    Terser by design is also likely; after all, we're talking about guys who decided what = and == should mean (i.e. which one should be assignment, and which comparison) by calculating the frequency of each operation, and using the shorter token for the more frequent operation (which turned out to be assignment).

  • - Text searching application source code works well when you have a large code base of different applications.

    - VS's tools do not let you find all references to a function both in the current solution and all of the solutions you have in all of the other applications making up your entire code base

    - Text searching finds references to your function in places that are not searched by VS find all references ( strings, scripts, dynamically compiled code, code invoked by finding a function name in metadata, etc.).  This helps to identify and fix areas where the developer overengineered an application (e.g., creating and using a web service that is only called from one application -> solution is to move the web service inside of the calling application)

    - Large scale code refactoring is eased with code formatted for easy text searching.  Brute force refactoring for duplicate/near duplication function finding is easier via text (findstr | sort | findstr then look at the sorted list in a text editor.  Reflection could help but requires compiled code which may be too costly to build).

    If you come into a client's office, develop 50% of an application and then leave before it is in production for 3 months, you will not need to do text searching.  Taking the same application through 4 or 5 major development cycles and supporting multiple different versions in production use by your customers requires text searching.  Text searching is most beneficial for code that was partially developed and never supported post production by consulting companies.

  • I'd like to second Stefan Wenig's syntax suggestions. And add T* as shorthand for IEnumerable<T>

    So we could declare the type "function that maps a list of customers to a list of addresses" like this:

       Customer* -> Address* getAddresses;

    Equivalent to:

       Func<IEnumerable<Customer>, IEnumerable<Address>> getAddresses;

    And then there's lifting, which already happens for Nullable, but it would be cool for other things to have that expressiveness. How about:

       e..Foo()

    as shorthand for:

       e.Select(i => i.Foo()) assuming e is a T* ("lifting")

    Allowing us to init the function we declared before:

       getAddresses = customers => customers..GetAddress();

    As (in line with the Linq query keywords) the .. lifting operator would expand to call Select, so you could define how lifting would work on other interfaces besides IEnumerable<T> by writing your own Select extension method.

    Which reminds me, how about also letting the operator overloading map to special method names just like the linq query keywords do? So, assuming nothing better is available (i.e. this would fail to compile in C# 4.0):

        var result = a + b;

    But in some future version the compiler would (as a last resort) try:

        var result = a.Plus(b);

    Obviously the current C# static operator overloading system is more ideal for the cases it already handles, because of the symmetry imposed by the lack of virtual dispatch on the LHS, as Eric has previously discussed.

    However, by allowing the operators to also map onto special method names, as a fallback mechanism, we'd get some nice advantages added to the language in a backward-compatible way. And it only seems reasonable given that the linq keywords work that way.

    e.g. define an extension method:

       public static T* Plus<T>(this T* source, T* other)
       {
           return source.Concat(other);
       }

    We'd now be able to concat two sequences with: a + b.

    Or with this:

       public static TOut Pipe<TIn, TOut>(this TIn source, TIn -> TOut maybe)
       {
           if (source == null)
               return null;
           return maybe(source);
       }

    We could say:

    return Music.GetCompany("4ad.com")
           | company => company.GetBand("Pixies")
           | band => band.GetMember("David")
           | member => member.Role;

    And if any of those steps produced null, the returned result would be null, i.e. maybe monad.

    I know it's a long shot, but no harm in asking, eh? :)

    Indeed, no harm. These are good ideas.

    I think if we were designing C# from scratch knowing what we know now, we'd probably go with a pattern-based approach to operator overloading like we do with query comprehensions.

    "*" is nice, but unfortunately it is too easily confused with the pointer syntax. We considered making "T{}" a shorthand for IEnumerable<T>, which is nice because it has a good symmetry with T[]. Unfortunately it did not meet the bar for C# 4.0. Maybe in a hypothetical future version.

    And finally, it is a little-known fact that C# does have a ".." operator. It only works when you type an expression into the watch window in the debugger! (UPDATE: Whoops, I am mistaken. See below.)

    -- Eric
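The lifting and Pipe proposals in the comment above rely on syntax that does not exist in C#, but both can be approximated with ordinary methods. The sketch below is only an approximation under stated assumptions: Pipe is written as a local function rather than an extension method or a | operator, the two dictionaries are made-up stand-ins for the hypothetical GetBand and GetMember calls, and the snippet assumes top-level statements (C# 9 or later).

```csharp
#nullable disable
using System;
using System.Collections.Generic;
using System.Linq;

// "Lifting": the proposed e..Foo() corresponds to e.Select(i => i.Foo()).
IEnumerable<string> names = new[] { "Pixies", "David" };
IEnumerable<int> lengths = names.Select(name => name.Length);
Console.WriteLine(string.Join(",", lengths)); // prints 6,5

// Pipe: apply the next step only if the previous one produced a value.
// Restricted to reference types so that null can signal "nothing".
TOut Pipe<TIn, TOut>(TIn source, Func<TIn, TOut> next)
    where TIn : class
    where TOut : class
{
    if (source == null)
        return null;
    return next(source);
}

// Hypothetical lookup tables standing in for GetBand / GetMember.
var bandByCompany = new Dictionary<string, string> { ["4ad.com"] = "Pixies" };
var memberByBand = new Dictionary<string, string> { ["Pixies"] = "David" };

// Returns the value for the key, or null when the key is absent.
string Find(Dictionary<string, string> table, string key) =>
    table.TryGetValue(key, out var value) ? value : null;

string member = Pipe(Pipe("4ad.com", c => Find(bandByCompany, c)),
                     b => Find(memberByBand, b));
string missing = Pipe(Pipe("example.invalid", c => Find(bandByCompany, c)),
                      b => Find(memberByBand, b));

Console.WriteLine(member);          // prints David
Console.WriteLine(missing == null); // prints True
```

In the second chain the first lookup yields null, so the second step is never invoked and null flows through to the result, which is the behaviour the comment describes.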

  • @Daniel T* is a pointer to T already, you'd have to find another symbol.

    besides that, monads already inspired LINQ, and I don't see why they should not inspire other new C# features. just as long as C# does not try to embrace the generic concept of monads. if that's what you want, you should definitely consider switching to another language.

  • @Greg

    Proper searching is of course very useful and important, but I think it's a problem for tools to solve, and not something the language design should consider.

  • Good to know that these things are being discussed.

    Personally I'd be happy to leave unsafe blocks with only the verbose syntax, as they're (hopefully) rare beasts anyway.

    (I also thought about suggesting -> for the lifting member access operator instead of .. to correspond with C/C++, but I think that would have been maliciously confusing, as I also used that symbol for Haskell-style func-type declarations in the same post!)

  • Eric,

    Can you give us hints about the double-dot (..) operator you mentioned?

    I found no documentation about it, was unable to google it, and failed to use it in a watch expression.

    Help! :-)

    Omer.

    Whoops, turns out I was wrong. Apparently we cut that feature a long time ago; I must have missed the memo. We were planning on having a feature in the debugger where you could say "myArray,[x..y]" in the watch window and then the debugger would show you just the portion of the array that you'd specified. Very handy for displaying chunks of large arrays. Looks like it was cut for lack of testing resources. Sorry to get your hopes up! -- Eric

  • I'm not a Haskell guy, or anything other than a C# guy, so I don't understand the syntax in Stefan Wenig's new feature proposal. If I have a lambda expression, e.g.

    (source, selector) => source.Select(selector)

    of type

    Func<IEnumerable<TSource>, Func<TSource,TResult>, IEnumerable<TResult>>,

    I would have expected the simplified notation for that type to be something like

    (IEnumerable<TSource>, (TSource -> TResult)) -> IEnumerable<TResult>.

    Why is it IEnumerable<TSource> -> (TSource -> TResult) -> IEnumerable<TResult>, i.e. what does the "->" symbol between the two input parameters mean?

    This is called "currying". Any function of two parameters can be turned into two functions of one parameter. If you have f = (x,y)=>x+y, and you call it f(2, 3), then you can turn that into g = x=>y=>x+y, and call it g(2)(3).  That is, the first call returns "y=>2+y". The technique is named after Haskell Curry, the same guy the language is named after. I can't believe I haven't blogged about this yet. I'll get right on it. -- Eric
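The f and g from the reply above can be written out with explicit delegate types, which also shows how the "->" between parameters maps onto nested Func types. This is a sketch only: Curry is a hypothetical helper name, not a framework method, and the snippet assumes top-level statements (C# 9 or later).

```csharp
using System;

// Uncurried: (int, int) -> int
Func<int, int, int> f = (x, y) => x + y;

// Curried: int -> (int -> int)
Func<int, Func<int, int>> g = x => y => x + y;

Console.WriteLine(f(2, 3)); // prints 5
Console.WriteLine(g(2)(3)); // prints 5

// Hypothetical helper that turns any two-parameter function
// into a chain of two one-parameter functions.
Func<A, Func<B, C>> Curry<A, B, C>(Func<A, B, C> fn) => a => b => fn(a, b);

// Supplying only the first argument yields a new one-parameter function.
Func<int, int> addTwo = Curry(f)(2);
Console.WriteLine(addTwo(10)); // prints 12
```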

  • Thanks for explanation Eric, I'm looking forward to your next great posts already.
