Extension methods and Generics (Extension Methods Part 5)

This is the 5th installment in my series of posts about extension methods. You can find links to the rest of the series here. Originally I had planned on discussing extension method versioning issues, but I've decided to postpone that topic to my next post and talk about extension methods and generics instead.

In Orcas we've introduced a new set of rules for the way we deal with generic extension methods that differs significantly from the way we deal with regular generic methods. When binding against extension methods we now perform generic type parameter inference in two passes instead of one. During the first pass we infer types for type parameters referenced by the first argument and during the second pass we infer types for any type parameters referenced by subsequent arguments. Internally we have been referring to this as "partial type inference" in our discussions about it. As an example, consider the following:

Imports System.Runtime.CompilerServices

Module M1
    Interface IFoo(Of T1, T2)
    End Interface

    <Extension()> _
    Public Sub Bar(Of T1, T2, T3)(ByVal x As IFoo(Of T1, T3), ByVal y As T2, ByVal z As T3)
    End Sub

    Sub Main()
        Dim x As IFoo(Of Integer, String) = Nothing
        ' Extension method call: x is the first argument
        x.Bar("Hello", 2)
        ' The same method invoked as a regular method call
        Bar(x, "Hello", 2)
    End Sub
End Module

Using the Whidbey type inference algorithm, the instance method call to Bar in Main, Bar(x, "Hello", 2), would result in a compile-time error. Attempting to resolve all types in one pass, the compiler would infer two conflicting types for T3: String from parameter x and Integer from parameter z. With Orcas, however, the extension method call x.Bar("Hello", 2) will not generate an error. Instead, the compiler will first infer types for T1 and T3 from parameter x (Integer and String, respectively) and then substitute their values back into the procedure's signature. During the second pass of type inference the compiler will then treat the method as if it were a method with only one type parameter, T2, rather than a method with three type parameters. A value will be inferred for T2 from y (String), and then a conversion will be inserted to convert the Integer argument supplied to z into the String value expected by the procedure.

Although this may seem like a complicated and perplexing rule, there is a bit of method to our madness. In particular it enables us to:

  1. Maintain transparency between extension methods and instance methods
  2. Avoid generating misleading IntelliSense information
  3. Produce a better query-editing experience

Each of these reasons is discussed in detail below:

Extension Method Transparency

To see why the first reason holds, let's consider the following:

<Extension()> _
Sub InsertTwice(Of T)(ByVal x As ICollection(Of T), ByVal y As T)
    x.Add(y)
    x.Add(y)
End Sub

Here we define an extension method named "InsertTwice" that applies to all implementations of the generic ICollection(Of T) interface. Its implementation is rather trivial. The interesting thing about it is that its type parameter T is, in a sense, more of a "generic type parameter" than a "generic method parameter". Its primary purpose is to define the types of the objects to which the method is applicable, rather than to define the signature of the method itself. The fact that there is a dependency between the method's signature and its "containing" class is a peculiarity of the method, not of the type parameter. The parameter's primary function is still to define the type.

An alternative way to look at it is to consider how the method would look if it were defined as an instance method rather than as an extension method. Had "InsertTwice" been defined as an instance method, T would almost certainly be defined as a type parameter on a class rather than as a type parameter on a method. The need for T to be defined as a method parameter is simply a side effect of the syntax used to define extension methods. In fact, from this point of view it becomes clear that all type parameters referenced by the first argument of any extension method really are generic type parameters rather than generic method parameters. The goal of the compiler, then, in splitting the inference into two distinct steps, is to restore things to their correct state. The generic type parameters get treated as type parameters (they are inferred solely from the "Me" parameter), and the generic method parameters get treated as method parameters (they are inferred from the method's arguments).
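
To make the instance-method view concrete, here is a minimal sketch. TwiceList is a hypothetical class invented purely for illustration (it is not part of the framework or of the example above); it only shows where T would naturally live if InsertTwice were written as an ordinary instance method.

```vb
Imports System.Collections.Generic

' Hypothetical collection type, used only to illustrate the point.
Public Class TwiceList(Of T)
    Inherits List(Of T)

    ' As an instance method, InsertTwice declares no type parameter of its own:
    ' T belongs to the containing class and is fixed by the receiver's type.
    Public Sub InsertTwice(ByVal y As T)
        Me.Add(y)
        Me.Add(y)
    End Sub
End Class
```

Given a TwiceList(Of String), the only possible argument type for InsertTwice is String. That is the behavior the two-pass inference is meant to reproduce for the extension method version.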

This does have some interesting implications. Mainly, it is now possible to invoke a generic method with some arguments explicitly provided and other arguments implicitly inferred. In particular, any type parameters that are determined to represent generic type parameters (because they are referenced by the first argument) are always implicitly inferred from the type of the object that the method was invoked on. The remaining type parameters, which are treated as generic method parameters, may then be explicitly supplied by the caller if necessary. However, the old Whidbey rules still apply to the group of parameters that are determined to belong to the method. Effectively, although it's possible to implicitly infer the "type" parameters and explicitly supply the "method" parameters, it is not possible to implicitly infer some method parameters and explicitly supply others.
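
As a rough sketch of what this looks like in practice, consider the following. ConvertEach is a hypothetical helper made up for this illustration, and the use of Convert.ChangeType is just one assumed way to give it a body. T is referenced by the first parameter and so is inferred from the receiver, while TResult is not and is supplied explicitly by the caller.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Runtime.CompilerServices

Module ConversionExtensions
    ' Hypothetical helper: T is a "type" parameter (referenced by the first
    ' argument), TResult is a "method" parameter (not referenced by it).
    <Extension()> _
    Public Function ConvertEach(Of T, TResult)(ByVal source As IEnumerable(Of T)) As List(Of TResult)
        Dim result As New List(Of TResult)
        For Each item As T In source
            ' Assumption: the element type supports conversion via IConvertible.
            result.Add(CType(Convert.ChangeType(item, GetType(TResult)), TResult))
        Next
        Return result
    End Function
End Module

Module Program
    Sub Main()
        Dim xs As Integer() = {1, 2, 3}
        ' Only TResult is written out; T is inferred as Integer from xs.
        Dim strings As List(Of String) = xs.ConvertEach(Of String)()
        Console.WriteLine(String.Join(",", strings.ToArray()))
    End Sub
End Module
```

Note that there is nothing for the compiler to infer TResult from in this call, so it has to be given explicitly. The point of the sketch is only that T does not have to be repeated alongside it.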

IntelliSense Usability

Moving on to reason #2 in our list above, this also fixes a small issue with IntelliSense. At the heart of our IntelliSense design is the principle that we never guide the user towards generating compiler errors. If an item shows up in a completion list in a given context, then it is valid to use that item in that context. When we show information in a tool tip, it is always up to date and accurate. This is really just an example of the IDE trying both to be nice and to avoid looking stupid. If we were to offer you something and then start to complain immediately after you selected it, we would appear to be not only rude but also stupid. This also means you can trust what the compiler tells you. If our "find symbol" says something doesn't exist or our "find all references" says something isn't used, then it doesn't exist or it isn't referenced. Similarly, if something does exist, or is being referenced somewhere in your code, then we will report those things to you. At least that's how we design things to be. Realistically there will always be bugs that slip through the cracks, or things that aren't 100% perfect. However, we try our best to make the information that we give you as useful as possible.

Before we implemented the two-step process for generic type inference in IntelliSense, some of the information we would show was a bit misleading. Technically it was accurate, but it was much less useful than it could be. To illustrate this, let's consider our "InsertTwice" example from above. If we were to invoke the method using the following code snippet:

Sub Main()
    Dim x As New List(Of String)
    x.InsertTwice(
End Sub

Then without our new rules we would end up showing a parameter info tool tip that looked like the following:

[Image: parameter info tool tip for InsertTwice showing the parameter y with the generic type T]



This, however, is a bit misleading. It indicates that InsertTwice is a generic method that can take an argument of any type T. This is not true: the type of the argument is actually fixed to be String. Supplying any argument that was not a String would yield a generic type inference error. More importantly, this error would then lead the author of this code to believe that the problem could be fixed by explicitly providing the type of T:

x.InsertTwice(Of String)

However, this would only end up generating yet another generic type inference error, mainly because types inferred from x would then conflict with types inferred from y. Obviously this hurt the usability of IntelliSense. With our new rules, however, we end up showing a parameter info tool tip that looks like this:

[Image: parameter info tool tip for InsertTwice showing the parameter y with the type String]



This is clearly much more useful.

Improved Query Support

The third, and perhaps most important, reason behind this decision is that it had a tremendous positive impact on our ability to support a good experience around using LINQ with VB. To see why, let’s consider a simple example:

Imports System.Linq

Module M1
    Sub Main()
        Dim xs As Integer() = {1, 2, 3, 4}
        Dim q = From x In xs Where x Mod 2 = 0 Select x
    End Sub
End Module

Here we define a simple query that selects all even integers out of an array. When processing this query the compiler converts it into the equivalent of the following code:

Dim q = xs.Where(Function(ByVal x) x Mod 2 = 0).Select(Function(ByVal x) x)

This invokes two extension methods: one called Where, which takes a lambda corresponding to the contents of the Where clause, and another called Select, which takes a lambda corresponding to the contents of the Select clause. As these queries are being typed the compiler will provide IntelliSense within each clause. Doing so, of course, requires identifying the element type of the collection being queried so that its members may be appropriately shown in IntelliSense completion lists. Our new typing rules help make this easy. If we look at the signature of the Where method we can see why this is:

Public Function Where(Of T)(ByVal source As IEnumerable(Of T), ByVal predicate As Func(Of T, Boolean)) As IEnumerable(Of T)

It defines an extension method that is applicable to all implementations of the IEnumerable(Of T) interface, taking in a delegate that maps from type T to Boolean and returning another IEnumerable of the same type. If the old Whidbey rules were used to perform type inference on the call to Where, it would not be possible to determine the element type T of the collection until the entire Where clause of the query had been processed, because the body of the Where clause needs to be converted into a lambda, wrapped up in a delegate type, and passed to the procedure as an argument. However, in order to assist query authors in writing a Where clause, the IDE must be able to deduce the element type of the collection. This creates an obvious chicken-and-egg situation: it becomes necessary to determine the value of T in order to be able to determine the value of T.

However, with our new rules this problem doesn't exist. With the new rules we can determine the element type of a collection just by looking at the collection itself, without explicitly having to bind all subexpressions first. This, of course, then makes it easy for us to provide accurate and useful IntelliSense inside our queries.
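
A small sketch may help show the effect outside of query syntax. Keep is a hypothetical extension method written for this illustration, with the same shape as Where. Because T is fixed to String by the receiver alone, the lambda parameter w needs no type annotation, and its members (such as Length) can be resolved before the lambda body is complete.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Runtime.CompilerServices

Module FilterExtensions
    ' Hypothetical method with the same shape as Where: T appears in the
    ' first parameter, so it is inferred from the receiver alone.
    <Extension()> _
    Public Function Keep(Of T)(ByVal source As IEnumerable(Of T), ByVal predicate As Func(Of T, Boolean)) As List(Of T)
        Dim result As New List(Of T)
        For Each item As T In source
            If predicate(item) Then result.Add(item)
        Next
        Return result
    End Function
End Module

Module Program
    Sub Main()
        Dim words As String() = {"a", "bb", "ccc"}
        ' T is already known to be String here, so w is typed as String.
        Dim longWords As List(Of String) = words.Keep(Function(w) w.Length > 1)
        Console.WriteLine(longWords.Count) ' prints 2
    End Sub
End Module
```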

Caveats

Unfortunately, this does introduce a few minor issues that you need to be aware of. Mainly:

  1. Type parameters referenced by the first argument of an extension method may not be constrained by other type parameters that are not referenced by the first parameter of an extension method. This means that while the following extension methods would be legal:

    <Extension()> _
    Sub M1(Of T1, T2)(ByVal x As T1)
    End Sub

    <Extension()> _
    Sub M2(Of T1, T2 As T1)(ByVal x As IFoo(Of T1, T2))
    End Sub


    this one, on the other hand, would not:

    <Extension()> _
    Sub M3(Of T1, T2 As T1)(ByVal x As T2)
    End Sub

  2. C# does not implement these rules, so any extension methods defined in C# that violate the above restriction will not be visible to VB programmers.

In general, however, we feel that these restrictions are largely outweighed by the benefits these rules introduce.

Side Note

On a side note, the instance method call I mention in my first example will also no longer be an error in Orcas. This, however, has to do with a different change we are introducing to resolve type inference conflicts, which I won't delve into in this post. It is sufficient to note that type inference in the instance method case will continue to use an "infer-everything-at-once" approach, whereas extension methods will use the two-step algorithm I outline here.

In any case, I think that's enough for today.

Stay tuned for my next post, where I will discuss some best practices for using extension methods in your programs.

Comments

  • I would like to see a keyword or something like c# 'this' modifier instead ExtensionMethod attribute. It's annoying have to imports the System.Runtime.CompilerServices namespace and writing the attribute.

    I guess that would be easy for the compiler to do this. A 'Extension' keyword would be like the 'Shared' keyword.

    Ex.:

    Public Extension Sub Times(ByVal x As Integer, ByVal d As DelegateSub)

       For i = 1 To x

           d()

       Next

    End Sub

    or

    Public Extension Function AddValue(ByVal x As Integer, ByVal y as Integer)

      Return x + y

    End Function

    or

    Public Function AddValue(ByVal Me x As Integer, ByVal y as Integer)

      Return x + y

    End Function

    Tony

  • Tony,

    Thanks for your feedback. We probably will not be able to make a change like this for Orcas, but I will forward your request over to our language design team for consideration in a future version of the product.

    Thanks

    -Scott

  • I hope that Orcas will support non base zero arrays and other very cool stuff of the glorious VB6 like the object Shape.

    Ciao

    Nicola

  • Excellent post. This is type of info that can be hard to come by but which allows a much deeper understanding of what's going on. Thanks.

  • The functionality of this feature will really help me.

    But I wonder about two design decisions.

    1) Extension method definitions are explicitly labeled as such.

    This forces the person who provides a utility method, i.e. any Public Shared method with arguments, to explicitly decide, at the time the method is coded, whether it will be available as an extension method.  It is not clear to me whether this is the right place and time to make that judgment.

    Moreover, I do not see why the decision has to be made in the first place.  Why not simply regard *any* Shared method as an extension method on the type of its first argument?  That way, even pre-existing Shared methods (of which there are quite a few) will be extension methods.

    I'm sure there are arguments against it, but I can't think of any.  Will there be far too many extension methods for the developer to deal with?  Doesn't seem likely.  If there are, is it too difficult to provide filtering facilities?  I'm not sure.

    2) Extension method calls are *not* explicitly labeled as such.

    This series discusses how the semantics of extension method calls is subtly different from normal method calls, and it points out that name clashes between the two types are silently resolved in favor of normal methods.  There appears to be a potential risk of confusion for the developer here.  One easy resolution is to make extension method calls syntactically distinct from normal method calls, e.g. by using @methodname(args) instead of .methodname(args).

    Again, I feel I can't really see beyond this single observation, and the VB team must have some pros and cons in mind that I'm overlooking.

    Would you care to share some of your thinking regarding these two points?

  • I think the big question is whether all those sucky Shared methods on Array, Enum, and Char will ever be callable using intuitive instance method conventions. It is for this reason alone I would almost welcome the idea of the consumer deciding whether to use extension methods. Sometimes class library writers, God bless them, just overlook little tiny features of usability.
