
Joining Halo Team

After spending 4 years on the C# compiler team, I have decided it's time for me to try something new. A few weeks back I joined the Halo team and have started my exciting journey there.

I won’t be posting anything new about C# in the near future, and I don’t know what shareable Halo info I will have. Thanks to all the people who read my blog and kept it entertaining. Please feel free to ask me any questions; I will certainly try to answer (or redirect to my ex-colleagues in C#).

Moving on to Halo, I hope you have seen this ODST trailer; if not, check it out. ODST has a band-of-brothers kind of feel and connects at a visceral level.

Posted by Sree_c | 3 Comments

Why Can’t Extension methods on Value Type be curried

This is a follow-up to an earlier post, Extension Methods and Curried Delegates. I was recently asked why error CS1113, “Extension methods 'Name' defined on value type 'typename' cannot be used to create delegates”, was added and what it means.

Here is the short version of the story: there is no direct translation to IL that would be correct for this operation (an extension method defined on a value type being curried as an instance method and assigned to a delegate).

To try this out, one can generate IL like this:

IL_0006: nop

IL_0007: ldc.i4.s 10

IL_0009: stloc.0

IL_000a: ldloc.0

IL_000b: box [mscorlib]System.Int32

IL_0010: ldftn int32 Test.Extension::func(int32)

IL_0016: newobj instance void Test.d1::.ctor(object,native int)

Here the value of the int that the extension method receives is something like 205XXXXX rather than 10.

The problem is that value type methods (with the exception of the virtuals inherited from System.Object) are typically called with a reference to the value type, not a boxed version of the value type (this allows meaningful mutation of the value type for instance). (When a non-virtual method on a boxed value type is called directly the JIT and runtime actually conspire to route the call through a thunk called an “unboxing stub” which essentially adjusts the this pointer downwards to make it look like a value type reference so we can call the real value type method implementation).

Delegates essentially force you to box any bound parameter you supply (since the .ctor signature takes an Object and internally we store and report this value as an object reference). So straightaway you have a potential semantic problem: by forcing the value type reference into a box you lose any identity (i.e. if a mutating method is called on the boxed value it will mutate the boxed copy, not the original).

But then there’s an additional problem. On the invocation side there are three potential ways a boxed value type argument could be processed:

1. The boxed value is passed in directly. That’s what you want if it’s intended to be the first argument of a static method which is typed as Object or ValueType, or if you’re calling one of the System.Object virtuals on a value type. For instance “static void someType.foo(Object)” or “override String valueType.ToString()”.

2. The boxed value is unboxed and passed in. That’s what you want if calling a static method with a first argument typed specifically as your value type. For instance “static void someType.foo(int32)”.

3. A reference to the value type stored inside the boxed value is passed in. This is what you want if you’re calling a non-virtual instance method on the value type. E.g. “instance void valueType.Foo()”.

I believe the CLR handles cases 1 and 3, but not case 2 (which is the case above). There is a technical problem with the implementation here, which has to do with the way the CLR implements delegates to get the fastest invocation speed possible.

One way I can think of to overcome this problem is to create a stub ourselves. That is, we could leverage the lambda implementation to create a lambda that calls the static method, and then use the lambda as the delegate value.
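A minimal sketch of that lambda workaround follows. The names IntFunc, Twice and MakeDelegate are made up for illustration and do not come from the original post. Note one assumption about semantics: the lambda captures the variable x, so the delegate sees the captured variable rather than a boxed copy taken when the delegate was created.

```csharp
using System;

public delegate int IntFunc();

public static class IntExtensions
{
    // Hypothetical extension method defined on a value type (int).
    public static int Twice(this int value)
    {
        return value * 2;
    }
}

public static class CurryWorkaround
{
    public static IntFunc MakeDelegate(int x)
    {
        // IntFunc d = x.Twice;   // rejected by the compiler with error CS1113
        // The lambda is compiled into an instance method on a compiler-generated
        // closure class (a reference type), and that method calls the static
        // extension method. The delegate is bound to the closure, not to a boxed int.
        return () => x.Twice();
    }

    public static void Main()
    {
        IntFunc d = MakeDelegate(10);
        Console.WriteLine(d()); // prints 20
    }
}
```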

For anyone searching the web for this, I hope this helps :)

Debugging Dynamic objects in C# Part 1

After a long time spent working on Dev10 features and fixing the many big and small things, I have finally had the time to cobble together a post.

What’s this post about ?

To begin with, I will be talking about debugging dynamic objects.

With C# 4.0 we can instantiate and perform operations on objects from dynamic languages like IronPython and IronRuby by using the dynamic keyword. In this post I will show you the tools added to inspect these objects and how one can reuse them for debugging COM objects.

Introduction to Dynamic

Basically, the dynamic keyword can be used to define the type of a local, field, return type or parameter. By doing so the user allows the compiler to defer the binding of calls and member accesses to run time. In order to do this the compiler does some code generation that may or may not be interesting to you (I will briefly touch on this later), but what is sure to pop into one's mind is “What’s this good for?”

The primary use of this is to bridge the gap between the .NET type system, with its set of strongly typed languages, and the alternate type systems that developers use to get their job done (think COM, scripting runtimes, the DOM etc.). The compiler does this by packaging all the information about the call in a payload and generating a call to the DLR (Dynamic Language Runtime). This call is executed at run time, where the DLR takes charge of routing the call to the correct provider (IronPython, COM, IronRuby) of the object. This provider then tries to bind the call and execute it.
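To see run-time binding without installing IronPython or touching COM, here is a small sketch that uses System.Dynamic.ExpandoObject as the provider. ExpandoObject is my choice for illustration and is not mentioned in the post; the member names Name and Val are likewise just examples.

```csharp
using System;
using System.Dynamic;

public static class DynamicDemo
{
    public static string Describe()
    {
        // The compiler knows nothing about Name or Val here; each access is
        // packaged as a call site and bound by the DLR at run time.
        dynamic person = new ExpandoObject();
        person.Name = "Sree";
        person.Val = 3;
        return person.Name + ":" + person.Val;
    }

    public static void Main()
    {
        Console.WriteLine(Describe()); // prints Sree:3
    }
}
```

A misspelled member such as person.Nmae would still compile and only fail when that line executes, which is exactly why good debugger support matters for these objects.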

What’s the Problem with Debugging this ?

Consider the following code

using System;
using Microsoft.Scripting;//requires reference to Microsoft.Scripting.dll and IronPython.dll
using IronPython.Compiler;
using IronPython.Hosting;

public class C
{
    static void Main(string[] args)
    {
        var sr = Python.CreateEngine();
        // a similar method exists for loading the script from a file instead of an inline string
        var code = sr.CreateScriptSourceFromString(@"   
class x(object):
 class_val = 5
 def some_method(self): pass
 def meth1(self, foo): 
   return foo
a = x()
a.val = 3
a.Name = 'Sree'", SourceCodeKind.Statements);
        var scope = sr.CreateScope();
        code.Execute(scope);
        dynamic y = scope.GetVariable("a");
        System.Diagnostics.Debugger.Break();

    }
}
 

Try inspecting y,

image

Since we are talking about dynamic objects, by nature they don’t have a strong type that represents the structure of the object. In this case I am using IronPython, which happens to represent the members in a hash table.

In the case of COM the structure of the object is actually in an alternate type system.

dynamic XlApplication = new Microsoft.Office.Interop.Excel.Application();
dynamic thisWorkbook = XlApplication.Workbooks.Add();
dynamic xlSheet = thisWorkbook.Worksheets.Add(After: thisWorkbook.ActiveSheet);
System.Diagnostics.Debugger.Break();

Try inspecting xlSheet,

image

In either case the basic debugging tools in VS fail, as they have no way of discovering this structure and end up showing only the most basic information, which in almost every case is useless.

In Conclusion

  1. Dynamic languages such as IronRuby or IronPython do not have strong types that represent the members they contain.
  2. COM support is limited to the generated interop assembly, and one can quickly fall off the cliff with no type information.
  3. The IntelliSense® tool does not show any of the dynamic members (since it’s driven by the static type system).

As a result, debugging can be hampered by an inability to query the state of the object while stepping through code.

So what does the solution look like ?

Dynamic View

In addition to the static view, the C# expression evaluator will add a special node, named “Dynamic View”, for

  1. Objects implementing IDynamicObject
  2. COM objects without a PIA (Primary Interop Assembly) or strong type

This is similar to the “Results View” node that was added for LINQ debugging, where the user could see the results of lazily evaluated LINQ queries.

The “Dynamic View” node has the following behavior.

  1. Expanding the Dynamic View node will query the IDynamicObject object for the members available for display and get the value for each.

image

  2. The children of the Dynamic View are immutable.
  3. Choosing “Add to Watch” for any child of the Dynamic View inserts a new watch that casts the object to dynamic (i.e. ((dynamic)y).Name).
  4. The Dynamic View and its children are assumed to be side-effecting, and therefore stepping in the debugger will not automatically re-evaluate the Dynamic View or its children; they are made stale.
  5. The Dynamic View and/or its children will need to be manually refreshed by the user.

A “dynamic” format specifier has been added to the expression evaluator. This allows the Dynamic View to be added to the watch by suffixing expressions with “, dynamic”

image

Pdb support

In response to this implementation, a PDB payload has been added for local variables typed in some way as dynamic. This payload is necessary since there is no metadata corresponding to locals and since there is no actual type called dynamic anywhere in the CLR. The compiler must find some way to decorate the locals so that it can be inferred at run time that a local is dynamic. The payload includes the name of the local, its slot ID and a boolean array (length <= 128) that represents a pre-order traversal of the type. For example, consider:

Dictionary<int, dynamic> dict = new Dictionary<int, dynamic>();

This code will produce a boolean array of false, false, true (for Dictionary, int and dynamic respectively) to indicate the portion of the type that is dynamic.
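The pre-order traversal can be sketched in a few lines. This is only an illustration of the idea and not the compiler's actual code: TypeShape is a hypothetical stand-in for a type as written in source (a name plus its type arguments), and Compute is a made-up helper.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model of a type as written in source: a name plus type arguments.
public sealed class TypeShape
{
    public readonly string Name;
    public readonly TypeShape[] Args;

    public TypeShape(string name, params TypeShape[] args)
    {
        Name = name;
        Args = args;
    }
}

public static class DynamicFlags
{
    // Pre-order: record the flag for the type itself, then visit each
    // type argument from left to right.
    public static List<bool> Compute(TypeShape type)
    {
        var flags = new List<bool>();
        Visit(type, flags);
        return flags;
    }

    private static void Visit(TypeShape type, List<bool> flags)
    {
        flags.Add(type.Name == "dynamic");
        foreach (TypeShape arg in type.Args)
        {
            Visit(arg, flags);
        }
    }

    public static void Main()
    {
        // Dictionary<int, dynamic>
        var dict = new TypeShape("Dictionary", new TypeShape("int"), new TypeShape("dynamic"));
        Console.WriteLine(string.Join(",", Compute(dict))); // prints False,False,True
    }
}
```

By the same logic, Dictionary<int, List<dynamic>> would yield false, false, false, true.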

So remember: if you want to debug dynamic locals, be sure you generate the debug information (PDB) and have it where the debugger can access it.

With regard to declaration (as opposed to inspection), declaring dynamic variables within the Immediate window is supported in some cases, including complex generic types (e.g. Dictionary<int, List<dynamic>>). But more on this later …

More later … cheers

Posted by Sree_c | 2 Comments

C# Debugging Improvement for VS 2008 SP1- Part II

Anonymous Types

On further review there are a few problems with anonymous types. They all boil down to the fact that the names given to these types are not valid C# type names (so that users can't explicitly use them in code). But while debugging this is exactly the kind of thing that one wants to do. Consider the following cases.

Case1:

The anonymous type appears in a cast, such as when an anonymous type is returned from a function or when it is cast from System.Object to the actual type.

var obj = Func(new{ I = 10, J = "Sree"});
System.Diagnostics.Debugger.Break(); 

private static object Func<T>(T obj)
{
       object o = obj;
       System.Diagnostics.Debugger.Break();
       return obj;
}

At the first breakpoint, adding any member of o to the watch window results in

image

At the second breakpoint, adding any member of obj to the watch window results in

image

Case 2:

The user wishes to create an instance of the anonymous type during a debugging session, or the anonymous type is a type argument of a generic type and the user wants to evaluate a static member, use it in a cast etc.

SP1 Changes

The anonymous type names are no longer invalid in the expression evaluator. Therefore

  1. C# constructs involving anonymous types can actually be evaluated at run time, and
  2. Instances of anonymous types can actually be created during a debugging session.

This means that the same example as above, tried with SP1, produces

image

See how the anonymous type name is no longer hidden and is used in the cast to bind to the members.

As an added bonus, while debugging you can now create instances of anonymous types on the fly and use them: assign them to objects, check for pathological conditions etc.

image

Hope you enjoy discovering and debugging anonymous types, now that they are much more useful in the debugger.

As always, do let me know if there are other things you would like to see me improve or implement.

Posted by Sree_c | 18 Comments

C# Debugging Improvements for VS 2008 SP1- Part 1

Overview

Over the past few months I have been busy closing down VS 2008 and working on some fixes for SP1. We have enabled some key debugging scenarios for C# in VS 2008 SP1. They include support for

  1. Range variables in queries
  2. Anonymous types
  3. Generic type arguments

Covering all of them was making the post long, so I am going to discuss range variables this time and continue with the other 2 in the next post.

Range variables

Range variables are the variables defined and used in a query. Currently they have a limitation in that they are not recognized as "true" local variables when stopped in or stepping through a query. Since queries are one of the newest constructs in C#, they are a prime candidate for SP1.

Range variables cannot currently be inspected in the watch window, in data tips (hovering over them), in quick watch or in the immediate window. They can only be seen in the locals window, but the display shows the internal transparent identifier created for the query. This leaves the variable of interest to the user nested and difficult to understand and use. This happens as soon as the user adds

1. A second from,

int[] ints = new int[] { 1, 2, 3, 45, 56 };

var q = from i in ints
            from k in ints
            where i > k      //add BP here
            select i + k;

image

2. A join or

image

3. A let clause to their query,

image

Seeing these simple examples it is not hard to imagine the debugging state when stopped in queries that use many of these in combination.
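To make the transparent identifier less mysterious, here is the two-from query from above next to the method-call form that the language's query translation rules produce, written out by hand. The parameter name t is my own; the compiler uses an unspeakable generated name, which is what shows up in the locals window.

```csharp
using System;
using System.Linq;

public static class TransparentIdentifierDemo
{
    public static int[] QuerySyntax(int[] ints)
    {
        var q = from i in ints
                from k in ints
                where i > k
                select i + k;
        return q.ToArray();
    }

    public static int[] MethodSyntax(int[] ints)
    {
        // The anonymous type { i, k } plays the role of the transparent
        // identifier: it carries both range variables through the pipeline.
        return ints
            .SelectMany(i => ints, (i, k) => new { i, k })
            .Where(t => t.i > t.k)
            .Select(t => t.i + t.k)
            .ToArray();
    }

    public static void Main()
    {
        int[] ints = { 1, 2, 3 };
        Console.WriteLine(string.Join(",", QuerySyntax(ints)));  // prints 3,4,5
        Console.WriteLine(string.Join(",", MethodSyntax(ints))); // prints 3,4,5
    }
}
```

Each additional from, join or let wraps the previous carrier in another anonymous type, which is why the nesting grows so quickly.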

image

Here hovering does not work, nothing can be added to the watch window, the transparent identifier gets larger, and the level of nesting makes it very hard to find the range variables you're looking for.

SP1 Changes

All of these issues are now fixed in SP1, and the user has the full set of debugging tools to help inspect and evaluate the current state when stopped at a breakpoint. Using the same example:

image

Conclusion

In conclusion, as always I am very interested in

  1. Any other pain points you have discovered while debugging,
  2. Things that are hard to debug, or
  3. Things that would be nice to inspect etc.

Looking forward to your feedback. Cheers

Posted by Sree_c | 17 Comments

Debugging C# 3.0 Part II

Overview

In the last article I covered the "results view" for lazy evaluated collections like Queries/Enumerable and the use of extension methods in the watch and immediate window. For completeness I will cover stepping, range variables, anonymous types and Xlinq support and this will lead us into the SP1 work for debugging.

Stepping

Stepping has been extended to support query execution. That is, when the query is being executed one can step through the query or add breakpoints in it. This allows the user to actually see the flow of execution through a query and find out how a particular element was retrieved from the enumerable.

image

The results view is marked as having side effects (just like a method call) and is disabled during a step. This means that stepping does not cause unwanted side effects by enumerating the contents of an enumerable.

image

Range Variables

The variables that are declared and used in queries can also be inspected when stopped during query execution (as shown under Stepping). Here the variables in question might be nested under a transparent identifier; if so, the top-level transparent identifier is shown in the locals window and the user can dig through this object to find the variable they are looking for.

image

Anonymous Types

Since anonymous types are identified by their structure, all instances of anonymous types give a brief description of their structure (the first 5 elements) in the value column. The name of the type is hidden, as it is an unparseable name in C#. All elements of the anonymous type are uneditable (after the immutability DCR).

Xlinq Objects

Objects of type System.Xml.Linq.XNode or any derived type get the built-in visualizers (text, XML and HTML), and the user can see the XML data with a few clicks.

This pretty much covers the work done to support debugging in VS 9. However, as I will describe soon, this leaves a pretty big hole when debugging queries where the range variables are under a transparent identifier (e.g. Color, agg, y etc. from the range variable example). Secondly, since anonymous type names are unparseable, any construct that contains an anonymous type (think type arguments, casts, constructors) is broken. These are some of the issues I will cover in my next post, on VS 9 SP1.

Posted by Sree_c | 3 Comments

Extension methods Interoperability between languages

Extension methods written in C# can be imported and called with extension method semantics in VB, and vice versa. This is possible since we decorate the assemblies, types and methods in the same manner, using the attribute

[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class | AttributeTargets.Assembly)]

public sealed class ExtensionAttribute : Attribute { } 

This should be of special note to developers writing their own compilers or doing IL generation. This decoration allows the importer to precisely identify the classes to import when looking for extension methods, allowing it to omit unnecessary types and their dependencies, which can quickly add up.

So if you're doing anything custom, be sure to inspect the IL to see whether other languages can discover and call your extension methods seamlessly.
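Besides reading the IL, the decoration can be checked with reflection. The sketch below walks an assembly the way an importer might, skipping anything that is not marked with System.Runtime.CompilerServices.ExtensionAttribute. StringExtensions.WordCount and FindExtensionMethods are made-up names for illustration; this is not how the C# or VB importer is actually implemented.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class StringExtensions
{
    // Hypothetical extension method used only for this illustration.
    public static int WordCount(this string text)
    {
        return text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length;
    }
}

public static class ExtensionScanner
{
    public static List<string> FindExtensionMethods(Assembly assembly)
    {
        var found = new List<string>();

        // An unmarked assembly cannot contain extension methods, so skip it entirely.
        if (!assembly.IsDefined(typeof(ExtensionAttribute), false))
        {
            return found;
        }

        foreach (Type type in assembly.GetTypes())
        {
            // Likewise, only marked types need their methods inspected.
            if (!type.IsDefined(typeof(ExtensionAttribute), false))
            {
                continue;
            }

            BindingFlags flags = BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic;
            foreach (MethodInfo method in type.GetMethods(flags))
            {
                if (method.IsDefined(typeof(ExtensionAttribute), false))
                {
                    found.Add(type.Name + "." + method.Name);
                }
            }
        }
        return found;
    }

    public static void Main()
    {
        foreach (string name in FindExtensionMethods(typeof(StringExtensions).Assembly))
        {
            Console.WriteLine(name);
        }
    }
}
```

If a custom compiler forgets the assembly-level or type-level attribute, a scan like this one finds nothing, even though the method itself is marked.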

Posted by Sree_c | 4 Comments

Conversion rules for Instance parameters and their impact

Overview: The instance parameter is the first parameter of an extension method and carries the "this" parameter modifier. I discuss the special conversion rules for it and some of the things that users of extension methods might encounter.

Consider the code below

using System;
using System.Linq;

namespace TestExtensions
{
    class Program
    {
        static void Main()
        {
            0.Foo();    // 1
            0f.Foo();
            0d.Foo();

            A.Foo(0);   // 2
            A.Foo(0f);
            A.Foo(0d);
        }
    }

    public static class A
    {
        public static void Foo(this long x) { Console.WriteLine("Long"); }
        //public static void Foo(this int x) { Console.WriteLine("int"); }
        public static void Foo(this object x) { Console.WriteLine("Object"); }
        public static void Foo(this double x) { Console.WriteLine("double"); }
        public static void Foo(this float x) { Console.WriteLine("float"); }
        public static void Foo(this short x) { Console.WriteLine("short"); }
    }
}

The call 0.Foo() at 1 binds to Foo(object), whereas A.Foo(0) at 2 binds to Foo(short). This means that the same method binds to a different overload when used as an extension method than when used as a static method.

That raises the question: why is that?

This is so because the only conversions allowed for the instance parameter of an extension method are

  1. Identity conversions
  2. Implicit reference conversions
  3. Boxing conversions

So numeric conversions are not considered, and this can cause some confusion. 0 is of type int and will not be implicitly converted to long while binding the method as an extension method.

Extension methods have different semantics than standard static methods, since they are called by the client as instance methods. So the conversion rules are modified so that they behave as such.

Let's say we define an instance method on int; does that mean I can call it on an object of type long? It is this kind of restriction that the conversion rules enforce.

Uncomment the perfect match Foo(this int x) and both calls bind to the same overload.

Posted by Sree_c | 2 Comments

Debugging and Delayed Execution in C# 3.0

Overview:

C# 3.0 added a few constructs, like queries, whose execution is delayed. This means that they are not actually executed until the results of the query are required. Debugging some of them can seem strange, since one can't step into the query where it is created but only where it is enumerated, as in a foreach loop. In this article I will show some of the problems that the user can face because the query is delay-executed and the debugger tries to be as non-intrusive as possible.

So what does it mean to step over the creation of the query? Does it mean the statements that are part of the query are executed? As you would have guessed, no. At this point we have just created delegates that point to code in the query (the bodies of the select, where etc.); these delegates form the arguments to the extension methods (Select, Where etc.) that make up the query.
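A quick way to convince yourself of this is to count how often the select body runs. In this sketch (the names twice, calls and Run are made up for illustration) the counter is still zero after the query has been created and only moves once the foreach enumerates it.

```csharp
using System;
using System.Linq;

public static class DeferredDemo
{
    // Returns { calls before enumeration, calls after enumeration }.
    public static int[] Run()
    {
        int calls = 0;
        int[] source = { 1, 2, 3 };
        Func<int, int> twice = i => { calls++; return i * 2; };

        // Creating the query only wires up delegates; twice is not invoked here.
        var q = from i in source
                select twice(i);
        int before = calls;

        // Enumerating the query is what actually runs the select body.
        int sum = 0;
        foreach (int v in q)
        {
            sum += v;
        }
        int after = calls;

        return new[] { before, after };
    }

    public static void Main()
    {
        int[] counts = Run();
        Console.WriteLine(counts[0]); // prints 0
        Console.WriteLine(counts[1]); // prints 3
    }
}
```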

Consider the following example:

Define a Class Library and add a single extension method to it

using System;
using System.Linq;

public static class Extensions
{
    public static string EM(this UInt32 i)
    {
        return "hello";
    }
}

Now create a console application and add the class lib as a reference. In the console application add the following code ..

using System;
using System.Linq;

namespace lateboundExtensionmethods
{
    class Program
    {
        static void Main(string[] args)
        {
            var q = from i in new uint[] { 1 }
                 select i.EM();                //module containing EM is not loaded

            System.Diagnostics.Debugger.Break();  
            foreach (var v in q)           //module containing EM is loaded
                Console.WriteLine(v);
        }
    }
}

Now run to the breakpoint and try to evaluate i.EM() in the watch window. We get an error like

 

What was all that about? Have I not just compiled and run this code? What's the debugger smoking?

The Problem:

The problem is not specific to extension methods. The query rewrite rules move the code in the select, where etc. into a delegate, and in order to do that a new static method is generated. Until this method is jitted, the assemblies referenced in it are not loaded. Since the assembly is not yet loaded in the debugging session, the expression evaluator can't find the methods if they are added to the watch window. This leaves the user feeling like they have stepped over code that can't be evaluated in the debugger. Once we come to the foreach loop, the code in the select actually needs to be executed, so the JIT compiles this method, which results in all the assemblies referenced in it being loaded.

On the other hand, with something like the following, evaluating the extension method in the watch window will work.

using System;
using System.Linq;

namespace lateboundExtensionmethods
{
    class Program
    {
        static void Main(string[] args)
        {
            uint j = 10;
            System.Diagnostics.Debugger.Break(); //module containing EM is loaded
            j.EM();

            System.Diagnostics.Debugger.Break();

        }
    }
}

Workarounds:

So if, while debugging your code, you are unable to execute a static or extension method defined in a satellite assembly, check the modules window to see whether the assembly is loaded. If it is not, for the debugging session you might want to add something like typeof(className) to your code, where className is a class defined in the satellite assembly.

Posted by Sree_c | 1 Comments

Debugging Features in C# 3.0 Part 1

Overview

C# 3.0 introduces many new constructs and opens entirely new ways of thinking about and developing code. In this article I will talk about the new debugging features that make it easy to see the running code and better understand it. In my experience one spends as much time debugging code later as writing it initially. In fact, stepping through the code and inspecting variables and expressions is one of the best ways to understand it. So it is not just the original writer but many more who will eventually debug it. To this end there are many new things that you will notice when you start debugging a C# 3.0 project; I will enumerate a few of them in this part of the article.

Results View

The primary aim of the results view is to provide an array-like view for enumerable objects. This allows the user to see the items that are the results of the query. The query/enumeration might be over a local collection (LINQ/XLinq/iterators) or a remote store (DLinq/custom providers). The idea here is not to change or hide the object's default structure while adding a simple way to inspect the result. This is quite different from debugger proxies, which hide the actual view of the object under a "raw" node and show the user the custom view. The rationale is that there are many useful properties/fields on the user's query, and we wanted to add this functionality without losing all that information. This brings up 2 interesting questions.

1. When is this applied?

The Results View is applied to an object/struct if

  1. It does not have a debugger type proxy,
  2. It does not implement IList or ICollection (they already have an items view), and
  3. It does implement IEnumerable or IEnumerable<T>.

2. What does the View look like?

  1. For an object whose compile-time type is one of the interfaces IEnumerable, IEnumerable<T>, IQueryable or IQueryable<T>, we hide the derived node and show the runtime type’s members directly under the object.
  2. A new node called “Results View” is added to the expression which, when expanded, will enumerate the enumerable.
  3. The Results View's value column warns the user that expanding will change the state of the object being inspected.
  4. A new format specifier is added to represent the view; it can be accessed by typing object/expression, results.

E.g. for a query like

int[] array  = new int[]{ 23, 3,54, 8, 10, 39, 87, 3, 7};
var q = from i in array.AsQueryable()
        let y = i * i
        let z = y * y
        select new { y, i, z };

System.Diagnostics.Debugger.Break();

 

Here we can see the Results View node is added as a child of the query q. Though q has a static type of IQueryable<SomeAnontype>, we directly show the members of the actual runtime type. The value column has an important message for the user, stating that expanding this view will enumerate the object. If the base of the expression in the watch window is also an enumerable, the results view is added to it. If a field/property which implements IEnumerable is hidden, then the results view inherits the hidden property from its parent.

Finally we can also access the results view by using the results specifier “, results”.

There are some exceptions to the rule: System.String is excluded from having this view even though it implements IEnumerable.
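The three conditions under "When is this applied?", together with the System.String exception, can be approximated with reflection. This is only a rough sketch of the rules as stated here, not the expression evaluator's real logic; WouldGetResultsView and the Numbers iterator are hypothetical names, and the sketch only looks for DebuggerTypeProxyAttribute declared directly on the type.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;

public static class ResultsViewRules
{
    public static bool WouldGetResultsView(Type type)
    {
        // Explicit exception: strings are enumerable but get no Results View.
        if (type == typeof(string)) return false;

        // Rule 1: no debugger type proxy on the type.
        if (type.IsDefined(typeof(DebuggerTypeProxyAttribute), false)) return false;

        // Rule 2: not IList or ICollection (IList derives from ICollection,
        // so one check covers both).
        if (typeof(ICollection).IsAssignableFrom(type)) return false;

        // Rule 3: implements IEnumerable (IEnumerable<T> derives from it).
        return typeof(IEnumerable).IsAssignableFrom(type);
    }

    // A C# iterator: its compiler-generated type is enumerable but is
    // neither a collection nor decorated with a type proxy.
    public static IEnumerable<int> Numbers()
    {
        yield return 1;
        yield return 2;
    }

    public static void Main()
    {
        Console.WriteLine(WouldGetResultsView(Numbers().GetType())); // prints True
        Console.WriteLine(WouldGetResultsView(typeof(int[])));       // prints False
        Console.WriteLine(WouldGetResultsView(typeof(string)));      // prints False
    }
}
```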

Extension Methods

Extension methods can now be used, with instance method call syntax, in the watch and immediate windows. This is very useful if you want to

  1. See which extension method would get called if the statement were placed here in the code.
  2. See the results of an extension method not present in the code being executed.
  3. Limit the number of elements returned by a query before expanding its results.
  4. Most importantly, since queries boil down to extension methods, use the watch/immediate windows like a scratch pad to try out some ad hoc query-like behavior.

Consider the following code

    using System;
    using System.Collections.Generic;
    using System.Linq;

    public class Program
    {
        static void Main(string[] args)
        {
            int[] array = new int[] { 1, 23, 45, 67, 12 };      // simple in memory array

            Func<Func<int, bool>, IEnumerable<int>> where = array.Where<int>;  //curry an Extension method

            System.Func<int, int> f = Program.identity;
            System.Diagnostics.Debugger.Break();
        }

        public static bool Eval(int i){ return i > 5; }

        public static bool Eval1(int val) { return val < 50; }

        public static int identity(int x) { return x; }
    }

Trim the enumerable before inspecting it (very useful if this call will be remoted to the provider of the extension method)

Calling an extension method with a method (predicate) as argument

Using the curried delegate from the code

Chaining extension methods to achieve ad hoc query-like execution

This assumes that the assembly containing the extension method has already been loaded by an earlier runtime call.
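The screenshots that went with the four captions above are not available here, so the exact watch expressions are unknown. As a guess at what they could have looked like, the sketch below runs one plausible expression per caption as ordinary code, using the array and helper methods from the sample program. Take(2) and the specific chaining are my assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WatchExpressions
{
    public static bool Eval(int i) { return i > 5; }
    public static bool Eval1(int val) { return val < 50; }
    public static int identity(int x) { return x; }

    private static string Show(IEnumerable<int> items)
    {
        return string.Join(",", items);
    }

    public static string[] Run()
    {
        int[] array = new int[] { 1, 23, 45, 67, 12 };
        Func<Func<int, bool>, IEnumerable<int>> where = array.Where<int>;

        return new[]
        {
            Show(array.Take(2)),                                      // trim before inspecting
            Show(array.Where(Eval)),                                  // method group as predicate
            Show(where(Eval1)),                                       // curried delegate from the code
            Show(array.Where(Eval).Where(Eval1).Select(identity))    // chained, query-like
        };
    }

    public static void Main()
    {
        foreach (string line in Run())
        {
            Console.WriteLine(line);
        }
        // prints:
        // 1,23
        // 23,45,67,12
        // 1,23,45,12
        // 23,45,12
    }
}
```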

All in all, using these 2 features in combination allows users to do some powerful analysis on a program while it executes. Though not as powerful as a full-fledged query analyzer with lambdas, it does allow the user to do very interesting stuff. I am interested in knowing the potential pain points that you come across, things that get in your way when debugging etc.

In the coming article I will cover stepping, anonymous types, range variables and XLinq support.

Posted by Sree_c | 7 Comments

Learning C#

I was recently asked by a developer, "I know C++; how do I get into C# and .NET?"

If you want to understand the language design and its inner workings, I would suggest The C# Programming Language.

If you want to use .NET and C#, try Practical .Net and C#.

 

For the new features in C# 3.0 (lambdas, extension methods, object initializers, LINQ etc.) check out the 3.0 spec for the overview and the C# site for the latest stuff (blogs, articles, videos and what not).

 

As with anything, write some applications using either Beta 1 or VS 2005,

and feel free to post back with questions :)

 

Posted by Sree_c | 4 Comments

Extension methods and Curried Delegates

Delegates 

Since extension methods behave like instance methods, it makes sense that we should be able to create delegates that accept the instance method signature. To this end we have added support for adding extension methods to a delegate invocation list.

Extension methods can now be used like instance methods when being added to a delegate invocation list. So for an extension method Bar defined as

static int Bar(this foo f)
{
    return f.val;
}

 

We now allow delegates that accept the instance method signature to add extension methods to their invocation list. So for

delegate int d1();

We can now do

d1 d = f.Bar;

In order to do this, the generated IL will look like so

IL_0000:  nop

IL_0001:  newobj     instance void Test.foo::.ctor()

IL_0006:  stloc.0

IL_0007:  ldloc.0

IL_0008:  ldftn      int32 Test.Extension::Bar(class Test.foo)

IL_000e:  newobj     instance void Test.d1::.ctor(object,   native int)

 

Where Test.d1 is the delegate class that we generate. That is, the static method is treated as an instance method, and the instance argument is curried into the delegate.

Only extension methods on reference types are supported for this feature. So you can pass a value type as the instance argument to the extension method, but only as long as it is boxed.
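Put together as a complete program, the pieces above look roughly like this. The initial value 42, the CurryDemo class and the Run helper are my additions for illustration. Because foo is a reference type, the delegate holds the same object that f refers to, so a later change to f.val is visible through the delegate.

```csharp
using System;

public class foo
{
    public int val = 42; // initial value chosen only for this illustration
}

public static class Extension
{
    public static int Bar(this foo f)
    {
        return f.val;
    }
}

public delegate int d1();

public static class CurryDemo
{
    // Returns the delegate's result before and after mutating f.
    public static int[] Run()
    {
        foo f = new foo();
        d1 d = f.Bar;        // f is curried as the first argument of the static Bar

        int first = d();
        f.val = 7;           // the delegate refers to the same object, not a copy
        int second = d();

        return new[] { first, second };
    }

    public static void Main()
    {
        int[] results = Run();
        Console.WriteLine(results[0]); // prints 42
        Console.WriteLine(results[1]); // prints 7
    }
}
```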

I encourage users to try this feature in Beta 1, released a few weeks back, and give feedback ...


Extension method Binding and Error reporting

Overview:

Extension methods are static methods that are bound with instance semantics. In this article I will give a brief overview of the various steps involved in binding an extension method. This will prepare the way to discuss the error reporting for extension methods and how these error messages can be used to diagnose the problem at hand.

All method calls bound in the compiler go through 3 phases:

  1. The methods to be bound are defined in source or imported,
  2. A lookup for their name finds them, and
  3. The method passes applicability for the given arguments.
  • Lookup: This is the phase where the method name is searched on the type and its base classes. In case of an instance method the object on the lhs provides the type,

eg  object.MethodName( arg1, arg2 ) becomes

      lhs  = object ,

      rhs = MethodName  

whereas for static method the type is explicitly specified. In either case the compiler looks for a method of the given name(identifier), on finding one it checks for access from the location of invocation, arity count (number of type parameters) etc.  On finding a method that match, it is added to a method group. This method group (roughly collection of methods) is the result of a lookup. Therefore lookup error in C# read roughly like "Type blaa does not contain method blaa".

  • Applicability: This is the phase where the given arguments are matched against the parameters of each method in the group. This can result in 0, 1 or more methods being found applicable. 0 is a binding failure, and the closest match is used for the error message. If there is exactly 1, we can bind the call; if there is more than 1, we run a best-ness algorithm. If no method is better than all the others we have an ambiguous call; otherwise we have a successful binding. Applicability errors in C# therefore read like "best overload for method blaa has some invalid arguments".
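The two phases above can be seen with ordinary static methods. The sketch below uses made-up names (OverloadDemo, Pick, Mixed); the commented-out calls show where each kind of failure would be reported, with the error numbers as I know them from the released compilers.

```csharp
using System;

public static class OverloadDemo
{
    public static string Pick(int x)  { return "int"; }
    public static string Pick(long x) { return "long"; }

    // Neither overload is better than the other for the call Mixed(1, 1).
    public static string Mixed(int a, long b) { return "int,long"; }
    public static string Mixed(long a, int b) { return "long,int"; }

    public static void Main()
    {
        short s = 7;
        Console.WriteLine(Pick(5));    // int  (both applicable; the identity conversion is best)
        Console.WriteLine(Pick(5L));   // long (only Pick(long) is applicable)
        Console.WriteLine(Pick(s));    // int  (both applicable; short -> int is the better conversion)

        // OverloadDemo.Missing(5);    // lookup failure: no member named Missing (CS0117)
        // Pick("five");               // applicability failure: CS1503 (older compilers also report CS1502)
        // Mixed(1, 1);                // more than one applicable method, none best: CS0121

        Console.WriteLine(Mixed(1, 1L)); // int,long (only the first overload is applicable)
    }
}
```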

Enter Extension methods

Definition

Extension methods can be defined in source or imported from external assemblies. If imported from external assemblies they are recognized as extension methods based on the extension attribute that decorates them.

[AttributeUsage(AttributeTargets.Method | AttributeTargets.Class | AttributeTargets.Assembly)]

public sealed class ExtensionAttribute : Attribute { }

This attribute is placed on the extension method, its type and its assembly by the C# and VB compilers. This allows extension methods defined in C# to be found and bound in VB, and vice versa. Anyone developing a compiler for a language that supports extension methods should make sure they decorate the extension methods with this attribute for interoperability. The attributes on the assembly and the type are useful for tools like the object browser that must scan the .NET Framework assemblies looking for extension methods (this considerably reduces the complexity of their search). C# does not support explicit use of the Extension attribute in code. If the compiler finds the attribute applied explicitly in source to a method, type or assembly, it reports error CS1112 and generates no IL.
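A tool can use the same attribute to recognise extension methods through reflection. The sketch below is one way to do that, assuming a hypothetical StringExtensions class; ExtensionAttribute itself lives in System.Runtime.CompilerServices.

```csharp
using System;
using System.Reflection;
using System.Runtime.CompilerServices;

public static class StringExtensions
{
    // Declared with "this", so the compiler emits ExtensionAttribute on it.
    public static int WordCount(this string s)
    {
        return s.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries).Length;
    }

    // An ordinary static method: no attribute is emitted.
    public static int Plain(string s)
    {
        return s.Length;
    }
}

public static class AttributeDemo
{
    public static bool IsExtension(MethodInfo method)
    {
        return method.IsDefined(typeof(ExtensionAttribute), false);
    }

    public static void Main()
    {
        Type t = typeof(StringExtensions);
        Console.WriteLine(IsExtension(t.GetMethod("WordCount")));              // True
        Console.WriteLine(IsExtension(t.GetMethod("Plain")));                  // False
        Console.WriteLine(t.IsDefined(typeof(ExtensionAttribute), false));     // True: the type is marked too

        // The assembly-level marker lets a scanner skip assemblies without extension methods.
        Console.WriteLine(t.Assembly.IsDefined(typeof(ExtensionAttribute), false));
    }
}
```

Checking the assembly first, then the type, then the method is the cheap-to-expensive order that the assembly and type markers are meant to enable.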

Once imported or defined in source the extension method is added to a cache. This cache will be used by lookup in compiler, language service (Intellisense) or the C# debugging components (watch/locals) for binding.

Lookup

When binding instance method calls or delegate invocations, the lookup can include extension methods in its results. If no instance method with a matching name is found, the lookup turns to extension methods. Extension methods are searched for in

1.     The innermost namespace where the call is to be bound,

2.     Then the namespaces imported by the "using" clauses of that namespace.

3.     This process continues moving outwards until we reach the topmost namespace.

Lookup therefore creates a list of extension methods for the innermost namespace, and this is the first list on which applicability is tried. Next comes the list of all the extension methods imported by the using clauses in that namespace, and so on. Extension method lookup thus produces an ordered list of lists of methods.
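A small sketch of that search order, using made-up namespaces (Library, App) and methods (Describe, Shout): the extension method declared in the namespace that contains the call is found before the one that is only imported by a using clause at file level.

```csharp
using System;
using Library;   // file-level using: searched only after the App namespace itself

namespace Library
{
    public static class LibraryExtensions
    {
        public static string Describe(this string s) { return "Library.Describe"; }
        public static string Shout(this string s)    { return s.ToUpperInvariant() + "!"; }
    }
}

namespace App
{
    public static class AppExtensions
    {
        public static string Describe(this string s) { return "App.Describe"; }
    }

    public static class Program
    {
        // Both Describe methods would accept a string, but the App list is tried first.
        public static string CallDescribe() { return "x".Describe(); }

        // Nothing named Shout exists in App, so the search moves outwards to the usings.
        public static string CallShout() { return "x".Shout(); }

        public static void Main()
        {
            Console.WriteLine(CallDescribe());  // App.Describe
            Console.WriteLine(CallShout());     // X!
        }
    }
}
```

Had both Describe methods been in the same list (for example, both imported by using clauses at the same level), the call would have been ambiguous instead.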

Applicability 

Applicability will consider extension methods for binding in either of these 2 cases

1.     Only extension methods were returned by lookup.

2.     Instance methods returned by lookup were not applicable.

In case 2, applicability actually calls back into lookup to begin the search for extension methods. This delays the potentially costly search for extension methods until it is really required. Applicability for extension methods is calculated by taking the instance object of the call and using it as the first argument when binding the methods. The conversion from the instance object to the first parameter follows a special rule (covered in the previous post). If the arguments match the parameters of the extension method, applicability continues the binding process for all the methods that belong to the current namespace list. On reaching the end of the list, we check whether exactly one method can be considered best; if so the binding succeeds, otherwise the call is ambiguous.
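One consequence of these two cases is that an applicable instance method always wins, even when an extension method would be a closer match for the arguments. The sketch below illustrates this with a hypothetical Widget class.

```csharp
using System;

namespace PrecedenceDemo
{
    public class Widget
    {
        public string Log(object value) { return "instance Log(object)"; }
        public string Name(int id)      { return "instance Name(int)"; }
    }

    public static class WidgetExtensions
    {
        public static string Log(this Widget w, int value)   { return "extension Log(int)"; }
        public static string Name(this Widget w, string key) { return "extension Name(string)"; }
    }

    public static class Program
    {
        public static string[] Run()
        {
            Widget w = new Widget();
            return new[]
            {
                w.Log(5),     // instance method is applicable (via boxing), so extensions are never searched
                w.Name(1),    // instance method is applicable
                w.Name("a"),  // instance method is not applicable (case 2), so the extension is bound
            };
        }

        public static void Main()
        {
            foreach (string line in Run())
            {
                Console.WriteLine(line);
            }
        }
    }
}
```

Run() returns "instance Log(object)", "instance Name(int)" and "extension Name(string)", in that order.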

Error Reporting

Now that we know how the compiler finds and binds the instance and extension method calls written by the user, let's see what happens when things go wrong. For me, good error messages are one of the cool features of C#, and keeping their quality high is always a strong motivation. Since extension method binding may do things differently from instance method binding, it is especially important that the user be able to diagnose the source of the problem. To this end, many of the old error messages such as CS1061 have been modified to mention extension methods, new messages have been added, and in a few places the kind of error reported has changed. I will try to give an example for each of these cases to illustrate my point. Each of the examples below shows the actual error in green, the new and improved error message in blue and the old, confusing error message in red. Go ahead, read through and be the judge….

Lookup Error

Whenever extension methods are eligible for a member lookup (that is, the lookup uses instance method semantics) and no member matching the given name is found, we give the new error message:

 

"'%1!ls!' does not contain a definition for '%2!ls!' and no extension method '%2!ls!' accepting a first argument of type '%1!ls!' could be found (are you missing a using directive or an assembly reference?)"

 

Applicability Errors

   

1.       Error reporting for extension methods has been improved to take into account the conversion of the instance argument to the first parameter of the extension method. If this conversion fails, the extension method is effectively not present on the type in question, so we give an error message that looks like a lookup error. The same mechanism is used by query error reporting to give errors like "Could not find an implementation of the query pattern for source type %1 ...".

 

E.g.   

var list = new ArrayList { 0, 1, 2, 3, 4, 5 };
list.Select((x) => x + 1).Where((x) => x > 5); // remember: Select is not defined for ArrayList, try List<T> instead

 

Results in

error CS1061: 'System.Collections.ArrayList' does not contain a definition for 'Select' and no extension method 'Select' accepting a first argument of type 'System.Collections.ArrayList' could be found (are you missing a using directive or an assembly reference?)

And not (Beta1)

error CS0411: The type arguments for method 'System.Linq.Queryable.Select<TSource,TResult>(System.Linq.IQueryable<TSource>, System.Linq.Expressions.Expression<System.Linq.Func<TSource,TResult>>)' cannot be inferred from the usage. Try specifying the type arguments explicitly.

   

IEnumerable i = null; // Select is defined for IEnumerable<T>, not for the non-generic IEnumerable
i.Select((x) => x);

 

Results in

error CS1061: 'System.Collections.IEnumerable' does not contain a definition for 'Select' and no extension method 'Select' accepting a first argument of type 'System.Collections.IEnumerable' could be found (are you missing a using directive or an assembly reference?)

And not(Beta1)

error CS0411: The type arguments for method 'System.Linq.Queryable.Select<TSource,TResult>(System.Linq.IQueryable<TSource>, System.Linq.Expressions.Expression<System.Linq.Func<TSource,TResult>>)' cannot be inferred from the usage. Try specifying the type arguments explicitly.

 

   

2.       There is a new error message for when argument binding fails on an extension method's instance argument. Such an extension method is not considered for error reporting if another one whose instance parameter matches can be found.

 

Consider   

Test t = new Test();
double doub = 2.4;
t.Calc(doub);   // can't convert from Test to the derived type DTest

Where

public class DTest : Test
{}

And

static class Extension
{
   public static double Calc(this DTest obj, double i)
   { return i; }
}

   

Results in the errors

error CS1928: 'Test' does not contain a definition for 'Calc' and the best extension method overload 'Extension.Calc(DTest, double)' has some invalid arguments

error CS1929: Instance argument: cannot convert from 'Test' to 'DTest'

And not (Beta1)

error CS1502: The best overloaded method match for 'TestLinq.Extension.Calc(TestLinq.DTest, double)' has some invalid arguments

error CS1503: Argument '1': cannot convert from 'TestLinq.Test' to 'TestLinq.DTest'
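For comparison, here is a sketch of working counterparts to the two failing examples above. The helper names FilterArrayList and CalcOnDerived are mine; Cast<int>() is the standard System.Linq operator that turns a non-generic IEnumerable into an IEnumerable<int>.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

namespace TestLinq
{
    public class Test { }
    public class DTest : Test { }

    public static class Extension
    {
        public static double Calc(this DTest obj, double i) { return i; }
    }

    public static class Program
    {
        public static List<int> FilterArrayList()
        {
            var list = new ArrayList { 0, 1, 2, 3, 4, 5 };
            // Cast<int>() is defined on the non-generic IEnumerable, so Select and Where now bind.
            return list.Cast<int>().Select(x => x + 1).Where(x => x > 5).ToList();
        }

        public static double CalcOnDerived()
        {
            DTest t = new DTest();   // the receiver now converts to the instance parameter
            return t.Calc(2.4);
        }

        public static void Main()
        {
            Console.WriteLine(string.Join(",", FilterArrayList()));  // 6
            Console.WriteLine(CalcOnDerived());
        }
    }
}
```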

For Type Inference Failures

Extension methods are unique in that it is the applicability test that determines whether the extension method is defined for the type. That is, lookup only determines whether we found an extension method with the given name, not whether the receiver can be converted to the instance parameter of the extension method. Therefore, when type inference for an extension method fails on the instance argument, it means that the extension method is actually not defined for the type, and a lookup error should be given. Similarly, when type inference fails on arguments other than the receiver, we should only show the extension methods whose instance parameter matches the receiver's type.

   

 

   var list1 = new List<int> { 1, 2, 3, 4 };
   var q = from x in list1
           from y in 5  // we need a collection here, something like new int[] { 5 }
           select x;

Or

list1.SelectMany(y => 5, (x, y) => x);

   

Will result in the error

Error CS0411: The type arguments for method 'System.Linq.Enumerable.SelectMany<TSource,TCollection,TResult>(System.Collections.Generic.IEnumerable<TSource>, System.Func<TSource,System.Collections.Generic.IEnumerable<TCollection>>, System.Func<TSource,TCollection,TResult>)' cannot be inferred from the usage. Try specifying the type arguments explicitly.

 

And not( Beta1)

Error CS0411: The type arguments for method 'System.Linq.Enumerable.SelectMany<TSource,TCollection,TResult>(System.collections.Generic.IQueryable<TSource>, System.Func<TSource,System.Collections.Generic.IQueryable<TCollection>>, System.Func<TSource,TCollection,TResult>)' cannot be inferred from the usage. Try specifying the type arguments explicitly.
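Following the hint in the code comment above, a sketch of the corrected query supplies a collection in the second from clause so that TCollection can be inferred. The class and method names (SelectManyDemo, QueryForm, MethodForm) are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectManyDemo
{
    public static List<int> QueryForm()
    {
        var list1 = new List<int> { 1, 2, 3, 4 };
        var q = from x in list1
                from y in new[] { 5 }   // a collection, so TCollection is inferred as int
                select x;
        return q.ToList();
    }

    public static List<int> MethodForm()
    {
        var list1 = new List<int> { 1, 2, 3, 4 };
        // The same query written as a direct call to the extension method.
        return list1.SelectMany(x => new[] { 5 }, (x, y) => x).ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", QueryForm()));   // 1,2,3,4
        Console.WriteLine(string.Join(",", MethodForm()));  // 1,2,3,4
    }
}
```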

 

I hope this improves your understanding of error reporting for extension methods and gives some insight into diagnosing the real problem when you see an error message that says "method does not exist on type" or some such thing...:)

Posted by Sree_c | 3 Comments

Extension methods in C#

Overview 

Extension methods are a new feature of C# 3.0, and I had the opportunity to implement them in the compiler. They are static methods that can be called with instance syntax on any object that is convertible (see the Convertibility section for details) to the first parameter of the method.

Validation

Extension methods are defined in C# as

static class Extensions
{
  public static IEnumerable<T> Where<T>(this IEnumerable<T> sequence, Predicate<T> predicate)
  {
     foreach (T item in sequence)
     {
        if (predicate(item))
        {
            yield return item;
        }
     }
  }
}

Several interesting things to note here:

  1. The method is defined in a top-level static class (the class is directly under the namespace).
  2. The method is static and decorates its first parameter with a new parameter modifier, "this". This parameter is called the "instance parameter", and it is an error to use the "this" modifier on any other parameter.
  3. No other parameter modifiers (ref, out etc.) are allowed with "this" (so value types can't be passed by reference to an extension method; VB will allow ref).
  4. The instance parameter can't be a pointer type.
  5. The method is public (it is accessible to anyone who can get to its parent class).
  6. The type parameters used must be defined on the method and not on the parent class.

5 & 6 are required because extension methods are bound as instance methods and the argument to instance parameter is the object on which the method is called. This means that the class in which the extension methods are defined is just a container to put the extension method in and will never be involved in binding it as an extension method e.g.

static class Extensions
{
   public static float Average(this System.Array array)
   {
       float average = 0;
       for (int i = 0; i < array.Length; i++)
       {
           average += (int)array.GetValue(i);
       }
       return average / array.Length;
   }
}

is bound as

int[] array = {34, 56, 100, 45, 23, 12};
array.Average(); // note: Extensions has nothing to do with the call
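To make the "just a container" point concrete, the sketch below calls the same method both ways. The wrapper names ViaInstanceSyntax and ViaStaticSyntax are mine, and the class is placed in a namespace of its own so that it cannot clash with any imported Average.

```csharp
using System;

namespace AverageDemo
{
    public static class Extensions
    {
        public static float Average(this Array array)
        {
            float sum = 0;
            for (int i = 0; i < array.Length; i++)
            {
                sum += (int)array.GetValue(i);
            }
            return sum / array.Length;
        }
    }

    public static class Program
    {
        // Instance syntax: the containing class is never named.
        public static float ViaInstanceSyntax(int[] values) { return values.Average(); }

        // Static syntax: the same method, with the receiver passed as the first argument.
        public static float ViaStaticSyntax(int[] values) { return Extensions.Average(values); }

        public static void Main()
        {
            int[] array = { 34, 56, 100, 45, 23, 12 };
            Console.WriteLine(ViaInstanceSyntax(array));  // 45
            Console.WriteLine(ViaStaticSyntax(array));    // 45
        }
    }
}
```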

 

      7.  The instance parameter cannot have the type of the type parameter.

       This would make it impossible to bind the method as an instance method, e.g.

public static class Extension3
{
    public static class Extension4
    {
        public static void Foo<T>(this T inst) { }
    }
}

Convertibility

The following conversions are defined for the instance parameter of extension methods:

  1. Identity conversion (type S is S)
  2. Implicit reference conversions
  3. Boxing Conversions
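Each of the three conversions can be exercised with a one-line extension method. In this sketch all names (Tag, CountItems, TypeName, Twice) are made up; the commented-out call shows a conversion that is not on the list, an implicit numeric conversion from int to long, being rejected for the receiver.

```csharp
using System;
using System.Collections.Generic;

namespace ConversionDemo
{
    public static class Extensions
    {
        public static string Tag(this string s) { return "string:" + s; }

        public static int CountItems(this IEnumerable<int> sequence)
        {
            int count = 0;
            foreach (int item in sequence) { count++; }
            return count;
        }

        public static string TypeName(this object o) { return o.GetType().Name; }

        public static long Twice(this long value) { return value * 2; }
    }

    public static class Program
    {
        public static string Identity()  { return "abc".Tag(); }                           // string is string
        public static int    Reference() { return new List<int> { 1, 2, 3 }.CountItems(); } // List<int> -> IEnumerable<int>
        public static string Boxing()    { return 42.TypeName(); }                          // int -> object
        public static long   Exact()     { return 5L.Twice(); }                             // long is long

        // public static long Numeric() { return 5.Twice(); }
        // does not compile: int -> long is not allowed for the instance argument

        public static void Main()
        {
            Console.WriteLine(Identity());   // string:abc
            Console.WriteLine(Reference());  // 3
            Console.WriteLine(Boxing());     // Int32
            Console.WriteLine(Exact());      // 10
        }
    }
}
```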

Finding Extension methods

So how does the compiler know which extension method to bind? It first looks for extension methods in the innermost namespace that contains the call, and then in all the namespaces imported by that namespace's "using" clauses. This process is repeated, moving outward, until we reach the topmost namespace.

Since extension methods can be imported into the current context by a "using" clause and bound to any object that is assignable (see the Convertibility section for details) to the instance parameter, all sorts of interesting possibilities open up for extending the methods implemented by a type. This can be done simply by importing a library of extension methods and using those methods as if they were declared on a type that you don't own. This means that

1. Depending on the library you import, the same code can be made to do different things.

2. The client gets an interesting way to extend a type that they do not own.

Delegates

Calling methods is all well and good, but do extension methods work when instantiating a delegate?

Of course they do (here is a good use of curried delegates). If you are not sure what curried delegates (yum yum, Indian food) are and how the compiler uses them, don't be alarmed; I will cover that in the next post, on extension method binding.

e.g. using the Where extension defined above

double[] array = {34, 45, 67.12, 95, 25, 69};
Func<Predicate<double>, IEnumerable<double>> fun = array.Where<double>;
fun(new Predicate<double>(Pred));

public static bool Pred(double arg)
{
   return arg > 10;
}

This works as long as the instance parameter is a reference type :)  (No Curry for you, Value Type)
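Put together as a complete program, the curried Where delegate might look like the sketch below. The array values are my own, chosen so that the predicate filters something out, and the Run helper and CurryDemo namespace are illustrative.

```csharp
using System;
using System.Collections.Generic;

namespace CurryDemo
{
    public static class Extensions
    {
        public static IEnumerable<T> Where<T>(this IEnumerable<T> sequence, Predicate<T> predicate)
        {
            foreach (T item in sequence)
            {
                if (predicate(item))
                {
                    yield return item;
                }
            }
        }
    }

    public static class Program
    {
        public static bool Pred(double arg)
        {
            return arg > 10;
        }

        public static List<double> Run()
        {
            double[] array = { 5, 34, 8, 95 };

            // array (a reference type) is curried in as the "sequence" argument,
            // leaving a delegate that only needs the predicate.
            Func<Predicate<double>, IEnumerable<double>> fun = array.Where<double>;

            return new List<double>(fun(Pred));
        }

        public static void Main()
        {
            Console.WriteLine(string.Join(",", Run()));  // 34,95
        }
    }
}
```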

Extension methods can also be called on a delegate, e.g. given

public delegate T del1<T>(T val);

and an extension method Exec defined as

static class Extension
{
     public static T Exec<T>(this del1<T> source, T param)
     {
         return source(param);
     }
}

 

We can write client code like 

 

public class Test
{
    public static void Main()
    {
           Test t = new Test();
           del1<int> d1 = t.func<int>;
           d1.Exec(100);
     }

    public T func<T>(T val)
    {
           return val;
     }

}

 

This ends the first post on extension methods. Next: how exactly extension methods are bound.


 

Posted by Sree_c | 18 Comments

Hello Web

Who am I?

My name is Sreekar Choudhary, and welcome to my blog. I am a dev on the C# compiler team. I work most of the time on language features and on implementing the debugging framework for C# developers inside VS.

 

What's this blog about?

Well, it's about the cool features that I get to design and work on, and any other relevant technical topic that is of interest to me or to the readers.

 

Posted by Sree_c | 7 Comments