I know the answer (it's 42)

A blog on coding, .NET, .NET Compact Framework and life in general....

July, 2008

Posts

    Auto generating Code Review Email for TFS

    • 8 Comments

    We use a small command line tool called crmail to auto-generate a code review email from a shelveset. I find the whole process very helpful and thought I'd share it, along with the tool (which has some really cool features).

    Features

    1. Automatic generation of the email from the shelveset details
    2. Hyperlinks point to TFS Web Access, so you can review code from machines without any tools installed, even without a source enlistment. Yes it's true!!! The only thing you need is access to your office intranet
    3. You can even use a Windows Mobile phone :) and even some non-Microsoft browsers. Ok, I guess I have sold this enough
    4. This is how the email looks with all the details pointed out
      (screenshot: the generated code review email)
    5. Effectively you can see the file diff, history, blame (annotate), shelveset details, associated bugs, everything from your browser, and the best thing is that each of these takes just one click.
      This is how the full diff looks in the browser
      (screenshot: the web diff view)

    Pre-reqs

    1. Team System Web Access (TSWA) 2008 power tool installed on your TFS server. For the shelveset link to work you'd need the TSWA SP1 CTP; the other features work with the base TSWA 2008 install
    2. Outlook installed on the machine on which the email is generated
    3. An enlistment and a TFS client installed on the machine on which the email is generated
    4. For reviewers there is no pre-req other than a browser and an email reader

    Dev process

    1. The developer creates a shelveset after he is done with his changes. He ensures he fills in all the details, including the reviewers' email addresses (semicolon-separated)
    2. He runs the tool with a simple command
      crmail shelvesetname
    3. The email gets generated and opened; he fills in any additional information and hits Send
    4. Done!!

    Reviewers

    Ok, they just click on the links in the email. Since these are mostly managers, what more do you expect out of them? Real devs will stick with firing up the tfpt command line :)

    Configuring the tool

    1. Download the binaries from here
    2. Unzip it. Open the crmail.exe.config file and modify the values in it to point to your TFS server and your code review distribution list (if you do not have one, leave it empty)
    3. Check it in to some tools folder in your source control so that everyone on your team has access to it
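    As a rough illustration, the config might look like the sketch below. The key names here are hypothetical; use whatever keys your copy of crmail.exe.config actually contains.

```xml
<!-- Hypothetical sketch of crmail.exe.config; the real key names may differ -->
<configuration>
  <appSettings>
    <!-- Your TFS server, used to build the TSWA hyperlinks -->
    <add key="TfsServer" value="http://yourtfsserver:8080" />
    <!-- Code review distribution list; leave empty if you don't have one -->
    <add key="CodeReviewDL" value="" />
  </appSettings>
</configuration>
```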

    Support

    Self-help is the best help :). Download the sources from here and enjoy. Buck Hodges' post on the direct link URLs will help you in case you want to modify the sources to do more.


    Writing exception handlers as separate methods may prove to be a good idea

    • 4 Comments

    Let us consider a scenario where you catch some exception and in the exception handler do some costly operation. You can write that code in either of the following ways

    Method-1 : Separate method call

    using System;       // Exception
    using System.Xml;   // XmlDocument
    // ErrorDumper and RemoteErrorReporter are the author's example types

    public class Program
    {
        public static void Main(string[] args)
        {
            try
            {
                using (DataStore ds = new DataStore())
                {
                    // ...
                }
            }
            catch (Exception ex)
            {
                ErrorReporter(ex);
            }
        }
    
        private static void ErrorReporter(Exception ex)
        {
            string path = System.IO.Path.GetTempFileName();
            ErrorDumper ed = new ErrorDumper(path, ex);
            ed.WriteError();
    
            XmlDocument xmlDoc = new XmlDocument();
            xmlDoc.Load(path);
            RemoteErrorReporter er = new RemoteErrorReporter(xmlDoc);
            er.ReportError();
        }
    }

    -

    Method-2 : Inline

    public static void Main(string[] args)
    {
        try
        {
            using (DataStore ds = new DataStore())
            {
                // ...
            }
        }
        catch (Exception ex)
        {
            string path = System.IO.Path.GetTempFileName();
            ErrorDumper ed = new ErrorDumper(path, ex);
            ed.WriteError();
    
            XmlDocument xmlDoc = new XmlDocument();
            xmlDoc.Load(path);
            RemoteErrorReporter er = new RemoteErrorReporter(xmlDoc);
            er.ReportError();
        }
    }

    -

    The simple difference is that in the first case the exception-handling code is written as a separate method, and in the second case it is placed directly inline inside the catch block.

    The question is which is better in terms of performance?

    If the handler contains significant code and type references, and you expect the exception to be thrown rarely during an application's execution, then Method-1 is going to be more performant.

    The reason is that just before a method executes, the whole method gets JITted. The JITted code contains stubs for the other methods it will call, but it doesn't do recursive JITting. This means when Main gets called it gets JITted, but the method ErrorReporter is not. So if the exception is never thrown, the code inside ErrorReporter never gets JITted. This can prove to be a significant saving in terms of time and space if the handling code is complex and refers to types not already referenced.

    However, if the code is inline, then the moment Main gets JITted all the code inside the catch block gets JITted too. This is expensive not only because it leads to JITting code that is never executed, but also because all types referenced in the catch block are resolved, resulting in a bunch of DLLs being loaded after searching through the disk. In our example above, System.Xml.dll and the DLL containing the remote error reporter get loaded even though they may never be used. Since disk access, assembly loading, and type resolution are slow, this simple change can yield a real saving.


    How does .NET CF handle null references

    • 3 Comments

    What happens when we have code as below?

    class B
    {
        public virtual void Virt(){
            Console.WriteLine("Base::Virt");
        }
    }
    
    class Program
    {
        static void Main(string[] args){
            B b = null;
            b.Virt(); // throws System.NullReferenceException
        }
    }

    Obviously a null reference exception is thrown. If you look at the IL, the call looks like

        L_0000: nop 
        L_0001: ldnull 
        L_0002: stloc.0 
        L_0003: ldloc.0 
        L_0004: callvirt instance void ConsoleApplication1.B::Virt()
        L_0009: nop 
        L_000a: ret 

    So in effect you'd expect the jitter to generate the following kind of code (in processor instructions)

    if (b == null)
        throw new NullReferenceException();
    else
        b->Virt(); // actually make the call safely through the this pointer

    However, generating null checks for every call would lead to code bloat. So on some platforms (e.g. .NET CF on WinCE 6.0 and above) the runtime uses the following approach instead

    1. Hook the native access violation exception (WinCE 6.0 supports this) up to a method in the execution engine (EE)
    2. Do not generate any null checks; directly generate calls through references
    3. In case the reference is null, a native AV is raised (an access violation, since the invalid address 0 is accessed) and the hook method is called
    4. At this point the EE checks whether the source of the access violation (native code) is inside a JITted code block. If yes, it creates the managed NullReferenceException and propagates it up the call chain
    5. If it's outside, then obviously either the CLR itself or some other native component is crashing, and there is nothing the EE can do about it
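    The dispatch step above can be sketched roughly as follows. This is a hypothetical C#-style rendering of what the (native) EE hook does; the helper names IsInsideJittedCode and RaiseManagedException are made up for illustration.

```csharp
// Hypothetical sketch of the EE's access-violation hook (the real code is native).
static void OnNativeAccessViolation(IntPtr faultingInstruction)
{
    if (IsInsideJittedCode(faultingInstruction))
    {
        // The AV came from JITted managed code dereferencing address 0,
        // so surface it as a managed NullReferenceException
        RaiseManagedException(new NullReferenceException());
    }
    // Otherwise the CLR or some other native component is crashing;
    // the EE lets the native exception propagate untouched
}
```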

    C# generates virtual calls to non-virtual methods as well

    • 2 Comments

    Some time back I had posted about a case where non-virtual calls are used for virtual methods, and promised to post about the reverse scenario. This issue of C# generating the callvirt IL instruction even for non-virtual method calls keeps coming back on C# discussion DLs every couple of months. So here it goes :)

    Consider the following code

    class B
    {
        public virtual void Virt(){
            Console.WriteLine("Base::Virt");
        }
    
        public void Stat(){
            Console.WriteLine("Base::Stat");
        }
    }
    
    class D : B
    {
        public override void Virt(){
            Console.WriteLine("Derived::Virt");
        }
    }
    
    class Program
    {
        static void Main(string[] args)
        {
            D d = new D();
            d.Stat(); // should emit the call IL instruction
            d.Virt(); // should emit the callvirt IL instruction
        }
    }

    The basic scenario is that a base class defines a virtual method and a non-virtual method, and a call is made to each through a derived class reference. The expectation is that the call to the virtual method (B.Virt) will go through the intermediate language (IL) callvirt instruction and the call to the non-virtual method (B.Stat) through the call IL instruction.

    However, this is not true, and callvirt is used for both. If we open the disassembly of the Main method using Reflector or ILDASM, this is what we see

        L_0000: nop 
        L_0001: newobj instance void ConsoleApplication1.D::.ctor()
        L_0006: stloc.0 
        L_0007: ldloc.0 
        L_0008: callvirt instance void ConsoleApplication1.B::Stat()
        L_000d: nop 
        L_000e: ldloc.0 
        L_000f: callvirt instance void ConsoleApplication1.B::Virt()
        L_0014: nop 
        L_0015: ret 

    Question is why? There are two reasons that have been put forward by the CLR team

    1. API change.
      The .NET team wanted a change of a method (API) from non-virtual to virtual to be non-breaking. Since the call is generated as callvirt anyway, a caller need not be recompiled in case the callee later becomes virtual.
    2. Null checking.
      If a call instruction is generated and the method body doesn't access any instance variable, then it is possible to successfully call the method even on a null object. This is currently possible in C++; see a post I made on this here. With callvirt there's a forced access to the this pointer, and hence the object on which the method is being called is automatically checked for null.

    callvirt does come with an additional performance cost, but measurements showed that there's no significant difference between call with a null check and callvirt. Moreover, since the JITter has full metadata for the callee, while JITting the callvirt it can generate processor instructions to make a static call if it figures out that the callee is indeed non-virtual.

    However, the compiler does try to optimize situations where it knows for sure that the target object cannot be null. E.g. for the expression i.ToString(); where i is an int, call is used to invoke the ToString method, because Int32 is a sealed value type and can never be null.
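    A quick way to see this yourself is to compile a small program and inspect it in ILDASM or Reflector; a sketch follows (the exact IL text can vary slightly across compiler versions):

```csharp
class CallVsCallvirt
{
    static void Main()
    {
        int i = 42;
        // Int32 is a sealed value type and can never be null, so the C# compiler
        // emits a plain `call` here, roughly:
        //     call instance string [mscorlib]System.Int32::ToString()
        System.Console.WriteLine(i.ToString());

        object o = new object();
        // For a reference-typed variable the compiler emits `callvirt`, which
        // forces the null check described above:
        //     callvirt instance string [mscorlib]System.Object::ToString()
        System.Console.WriteLine(o.ToString());
    }
}
```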


    Microsoft Roundtable

    • 2 Comments

    Our conference rooms have been fitted with this really weird-looking device.

    I had no clue what the thing was. Fortunately its box was still in the room along with the manual. It's called the Microsoft RoundTable and it is actually a 360-degree camera (with 5 cameras and 6 microphones). It comes with bundled software that lets all participants be visible to the other side in a Live Meeting in real time. It shows the whiteboard, and the software is intelligent enough to focus on and track the active speaker (using the microphones and face recognition) and much, much more (a lot of MS Research work has gone into it). The video below gives you some idea, and head over to this post for a review and an inside view of the device.

    Simply put, it's AWESOME


    String equality

    • 1 Comments

    akutz has one of the most detailed posts on string interning and equality comparison performance metrics I have ever seen. Head over to the post here.

    I loved his conclusion, which is the crux of the whole story.

    "In conclusion, the String class’s static Equals method is the most efficient way to compare two string literals and the String class’s instance Equals method is the most efficient way to compare two runtime strings. The kicker is that there must be 10^5 (100,000) string comparisons taking place before one method starts becoming more efficient than the next. Until then, use whichever method floats your boat."


    Do namespace using directives affect Assembly Loading?

    • 0 Comments

    The simple answer is no, the inquisitive reader can read on :)

    Close to two years back I had posted about the two styles of coding using directives, as follows

    Style 1

    namespace MyNameSpace
    {
        using System;
        using System.Collections.Generic;
        using System.Text;
        // ...
    }

    -

    Style 2

    using System;
    using System.Collections.Generic;
    using System.Text;
    namespace MyNameSpace { /* ... */ }

    -

    and outlined the benefits of the first style (using directives inside the namespace). This post is not to re-iterate them.

    This post tries to figure out whether either of the styles has any bearing on the loading order of assemblies. Obviously at first look it clearly seems that it shouldn't, but this has caused some back-and-forth discussion over the web.

    Scott Hanselman posted about a statement on the Microsoft StyleCop blog which states

    "When using directives are declared outside of a namespace, the .Net Framework will load all assemblies referenced by these using statements at the same time that the referencing assembly is loaded.

    However, placing the using statements within a namespace element allows the framework to lazy load the referenced assemblies at runtime. In some cases, if the referencing code is not actually executed, the framework can avoid having to load one or more of the referenced assemblies completely. This follows general best practice rule about lazy loading for performance.

    Note, this is subject to change as the .Net Framework evolves, and there are subtle differences between the various versions of the framework."

    This just doesn't sound right, because using directives have no bearing on assembly loading.

    Hanselman did a simple experiment with the following code

    using System;  
    using System.Xml;  
      
    namespace Microsoft.Sample  
    {  
       public class Program  
       {  
          public static void Main(string[] args)  
          {  
             Guid g = Guid.NewGuid();  
             Console.WriteLine("Before XML usage");  
             Console.ReadLine();  
             Foo();  
             Console.WriteLine("After XML usage");  
             Console.ReadLine();  
          }  
      
          public static void Foo()  
          {  
             XmlDocument x = new XmlDocument();  
          }  
       }  
    }  

    -

    and then watched the load timing using Process Explorer, moved the usings inside the namespace, and repeated the experiment. In both cases System.Xml.dll loaded only after he hit Enter on the console, clearly indicating that it got lazy-loaded either way.

    Let me try to give a step-by-step rundown of how the whole type lookup of XmlDocument happens in .NET CF, which in turn should throw light on whether using directives have any bearing on assembly loading.

    1. When the Main method is JITted and run, System.Xml.dll is not yet loaded
    2. When the method Foo is called, the execution engine (referred to as the EE) tries to JIT it. As documented, the JITter only JITs methods that are about to be executed
    3. The JITter checks whether the method Foo is managed (it could be native as well, due to mixed-mode support) and then whether it's already JITted (by a previous call); since it's not, it goes ahead with JITting it
    4. The JITter validates a bunch of things, like whether the class on which the method Foo is being called (in this case Microsoft.Sample.Program) is valid and has been initialized, stack requirements, etc.
    5. Then it tries to resolve the local variables of the method. It waits until this point to resolve the local variables' type references so that it can save time and memory by not JITting/loading types that are referenced only by methods that are never executed
    6. Then it tries to resolve the type of the variable, which in this case is System.Xml.XmlDocument
    7. It checks whether the type is already in the cache, that is, whether it is already loaded
    8. Since that's not the case, it searches for it based on the type reference information
    9. This information contains the full type reference, including the assembly name, which in this case is System.Xml.dll, and also version information, strong name information, etc.
    10. All of the above, along with other information like the executing application's path, is passed to the assembly loader to load the assembly
    11. The usual assembly search sequence is used to look for the assembly; it is then loaded and the type reference subsequently gets resolved

    If you look at the above steps, there is no dependency of assembly loading on using directives. Hence, at least on .NET CF, whether you put the usings outside or inside the namespace, the referenced assemblies get loaded exactly at the time of the first reference to a type from that assembly (step #5 above is the key).
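    If you want to observe the lazy loading yourself on the desktop CLR, a sketch like the one below works; AppDomain.AssemblyLoad is a standard event, though the exact load timing can vary across framework versions and NGen state.

```csharp
using System;
using System.Xml;

namespace LazyLoadDemo
{
    public class Program
    {
        public static void Main()
        {
            // Print every assembly as the runtime loads it
            AppDomain.CurrentDomain.AssemblyLoad += (sender, e) =>
                Console.WriteLine("Loaded: " + e.LoadedAssembly.GetName().Name);

            Console.WriteLine("Before calling Foo");
            Foo(); // System.Xml should show up as loaded only around this point
            Console.WriteLine("After calling Foo");
        }

        // System.Xml.XmlDocument is referenced only here, so System.Xml.dll
        // need not be loaded until Foo is JITted/executed
        static void Foo()
        {
            XmlDocument x = new XmlDocument();
            Console.WriteLine("Created " + x.GetType().FullName);
        }
    }
}
```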
