
String equality


akutz has written one of the most detailed posts on string interning and equality-comparison performance metrics I have ever seen. Head over to the post here.

I loved his conclusion which is the crux of the whole story.

"In conclusion, the String class’s static Equals method is the most efficient way to compare two string literals and the String class’s instance Equals method is the most efficient way to compare two runtime strings. The kicker is that there must be 10^5 (100,000) string comparisons taking place before one method starts becoming more efficient than the next. Until then, use whichever method floats your boat."
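The two methods the quote refers to can be put side by side. The sketch below is only an illustration of the static and instance forms and of interning; the class and method names are made up for this example, and it makes no attempt to reproduce akutz's measurements.

```csharp
using System;

public static class StringEqualityDemo
{
    // Static form: safe even when the left operand is null.
    public static bool StaticEquals(string a, string b)
    {
        return string.Equals(a, b);
    }

    // Instance form: throws NullReferenceException when 'a' is null.
    public static bool InstanceEquals(string a, string b)
    {
        return a.Equals(b);
    }

    public static void Main()
    {
        string literal = "hello";                                        // interned literal
        string runtime = new string(new[] { 'h', 'e', 'l', 'l', 'o' }); // built at run time

        Console.WriteLine(StaticEquals(literal, runtime));    // True
        Console.WriteLine(InstanceEquals(literal, runtime));  // True

        // Same contents, different objects until the runtime string is interned.
        Console.WriteLine(object.ReferenceEquals(literal, runtime));                // False
        Console.WriteLine(object.ReferenceEquals(literal, string.Intern(runtime))); // True
    }
}
```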

Posted by abhinaba | 1 Comments

Writing exception handlers as separate methods may prove to be a good idea


Consider a scenario where you catch an exception and perform some costly operation in the exception handler. You can write that code in either of the following ways:

Method-1 : Separate method call

public class Program
{
    public static void Main(string[] args)
    {
        try
        {
            using (DataStore ds = new DataStore())
            {
                // ...
            }
        }
        catch (Exception ex)
        {
            ErrorReporter(ex);
        }
    }

    private static void ErrorReporter(Exception ex)
    {
        string path = System.IO.Path.GetTempFileName();
        ErrorDumper ed = new ErrorDumper(path, ex);
        ed.WriteError();

        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.Load(path);
        RemoteErrorReporter er = new RemoteErrorReporter(xmlDoc);
        er.ReportError();
    }
}


Method-2 : Inline

public static void Main(string[] args)
{
    try
    {
        using (DataStore ds = new DataStore())
        {
            // ...
        }
    }
    catch (Exception ex)
    {
        string path = System.IO.Path.GetTempFileName();
        ErrorDumper ed = new ErrorDumper(path, ex);
        ed.WriteError();

        XmlDocument xmlDoc = new XmlDocument();
        xmlDoc.Load(path);
        RemoteErrorReporter er = new RemoteErrorReporter(xmlDoc);
        er.ReportError();
    }
}


The only difference is that in the first case the exception-handling code is written as a separate method, while in the second case it is placed inline in the catch block itself.

The question is which is better in terms of performance?

If the handler contains a significant amount of code and type references, and you expect the exception to be thrown rarely during the application's execution, then Method-1 is going to be more performant.

The reason is that a method is jitted as a whole just before it is first executed. The jitted code contains stubs for the other methods it calls, but the jitter does not recursively jit those callees. This means that when Main gets called it gets jitted, but ErrorReporter is still not jitted. So if the exception is never thrown, the code inside ErrorReporter never gets jitted. This can be a significant saving in time and space if the handling code is complex and refers to types not already referenced.

However, if the code is inline, then the moment Main gets jitted all the code inside the catch block gets jitted as well. This is expensive not only because it leads to jitting code that is never executed, but also because all types referenced in the catch block are resolved, which results in loading a bunch of DLLs after searching through the disk. In our example above, System.Xml.dll and the DLL containing the remote error reporting get loaded even though they will never be used. Since disk access, assembly loading and type resolution are slow, this simple change can give some savings.
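One way to watch this on your own machine is to subscribe to the assembly-load event and keep the XML-heavy work in a separate, non-inlined method. The following is a minimal sketch under assumptions: LazyHandlerDemo and BuildReport are hypothetical names, and a bare XmlDocument stands in for the ErrorDumper and RemoteErrorReporter types from the example above. Which "Loaded:" lines appear, and when, depends on the runtime version and on whether the code was precompiled, so no output is shown.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Xml;

public static class LazyHandlerDemo
{
    public static void Main()
    {
        Console.WriteLine("Start"); // touch Console before subscribing to load events

        // Report every assembly the runtime loads from this point on.
        AppDomain.CurrentDomain.AssemblyLoad += (sender, e) =>
            Console.WriteLine("Loaded: " + e.LoadedAssembly.GetName().Name);

        try
        {
            throw new InvalidOperationException("simulated failure");
        }
        catch (Exception ex)
        {
            // Only this call site lives in Main; the XML types are referenced elsewhere.
            Console.WriteLine(BuildReport(ex));
        }
    }

    // NoInlining keeps the XML type references out of Main's method body.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static string BuildReport(Exception ex)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<error/>");
        doc.DocumentElement.SetAttribute("message", ex.Message);
        return doc.OuterXml;
    }
}
```

Moving the body of BuildReport into the catch block gives the inline variant to compare against.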

Microsoft Roundtable

Our conference rooms have been fitted with this really weird-looking device.

I had no clue what the thing was. Fortunately its box was still in the room along with the manual. It's called the Microsoft RoundTable and it is actually a 360-degree camera (with 5 cameras and 6 microphones). It comes with bundled software that lets all participants be visible to the other side of a live meeting in real time. It shows the whiteboard, and the software is intelligent enough to focus on and track the active speaker (using microphones and face recognition), and much more (a lot of MS Research work has gone into it). The video below gives you some idea; head over to this post for a review and an inside view of the device.

Simply put, it's AWESOME

Posted by abhinaba | 2 Comments

Do namespace using directives affect Assembly Loading?


The simple answer is no; the inquisitive reader can read on :)

Close to two years back I posted about the two styles of placing using directives, as follows

Style 1

namespace MyNameSpace
{
    using System;
    using System.Collections.Generic;
    using System.Text;
    // ...
}


Style 2

using System;
using System.Collections.Generic;
using System.Text;
namespace MyNameSpace
{
    // ...
}


and outlined the benefits of the first style (using directives inside the namespace). This post is not meant to reiterate them.

This post is to figure out whether either of the styles has any bearing on the loading order of assemblies. At first look it is obvious that it shouldn't, but this has caused some back-and-forth discussion over the web.

Scott Hanselman posted about a statement on the Microsoft StyleCop blog which states

"When using directives are declared outside of a namespace, the .Net Framework will load all assemblies referenced by these using statements at the same time that the referencing assembly is loaded.

However, placing the using statements within a namespace element allows the framework to lazy load the referenced assemblies at runtime. In some cases, if the referencing code is not actually executed, the framework can avoid having to load one or more of the referenced assemblies completely. This follows general best practice rule about lazy loading for performance.

Note, this is subject to change as the .Net Framework evolves, and there are subtle differences between the various versions of the framework."

This just doesn't sound right, because using directives have no bearing on assembly loading.

Hanselman did a simple experiment with the following code

using System;  
using System.Xml;  
  
namespace Microsoft.Sample  
{  
   public class Program  
   {  
      public static void Main(string[] args)  
      {  
         Guid g = Guid.NewGuid();  
         Console.WriteLine("Before XML usage");  
         Console.ReadLine();  
         Foo();  
         Console.WriteLine("After XML usage");  
         Console.ReadLine();  
      }  
  
      public static void Foo()  
      {  
         XmlDocument x = new XmlDocument();  
      }  
   }  
}  


He then watched when assemblies got loaded using Process Explorer, moved the using directives inside the namespace and repeated the experiment. In both cases System.Xml.dll was loaded only after he hit Enter on the console, clearly indicating that it was lazy loaded either way.

Let me give a step-by-step rundown of how the type lookup of XmlDocument happens in .NETCF, which in turn should throw light on whether using directives have any bearing on assembly loading.

  1. When the Main method is jitted and run, System.Xml.dll is not yet loaded
  2. When the method Foo is called, the execution engine (referred to as the EE) tries to JIT the method. As documented, the jitter only JITs methods that are about to be executed.
  3. The jitter checks whether the method Foo is managed (it could be native as well, due to mixed mode support) and then whether it has already been jitted (by a previous call). Since it has not, the jitter goes ahead with jitting it
  4. The jitter validates a bunch of things, such as whether the class on which Foo is being called (in this case Microsoft.Sample.Program) is valid and has been initialized, the stack requirements, etc.
  5. Then it tries to resolve the local variables of the method. It waits until this point to resolve local variable type references so that it saves time and memory by not jitting/loading types that are referenced only by methods that are never executed
  6. Then it tries to resolve the type of the variable, which in this case is System.Xml.XmlDocument.
  7. It checks whether the type is already in the cache, that is, whether that type is already loaded
  8. Since that is not the case, it searches for the type based on the type reference information
  9. This information contains the full type reference, including the assembly name (in this case System.Xml.dll), version information, strong name information, etc.
  10. All of the above, along with other information such as the executing application's path, is passed to the assembly loader to load the assembly
  11. The usual assembly search sequence is used to look for the assembly; it is then loaded and the type reference subsequently gets resolved

As the steps above show, assembly loading in no way depends on using directives. Hence, at least on .NETCF, whether you put the using directives outside or inside the namespace, a referenced assembly gets loaded exactly when a type from that assembly is first referenced (step #5 above is the key).
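Step 9 is the part that makes using directives irrelevant: the compiled type reference already names the assembly. A small sketch that prints that information through reflection follows; TypeRefDemo and AssemblyOf are illustrative names. The exact assembly name differs between runtimes (for example, desktop .NET Framework and newer .NET versions report different assemblies for XmlDocument), so no output is shown.

```csharp
using System;
using System.Xml;

public static class TypeRefDemo
{
    // Returns the simple name of the assembly that defines the given type.
    public static string AssemblyOf(Type t)
    {
        return t.Assembly.GetName().Name;
    }

    public static void Main()
    {
        // The assembly-qualified name carries the type name plus the assembly
        // name, version, culture and public key token.
        Console.WriteLine(typeof(XmlDocument).AssemblyQualifiedName);
        Console.WriteLine(AssemblyOf(typeof(XmlDocument)));
    }
}
```

Moving the using directives in and out of a namespace block changes nothing in this output, because namespaces only affect how the compiler resolves names at compile time.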

Auto generating Code Review Email for TFS


We use a small command line tool called crmail to auto-generate a code review email from a shelveset. I find the whole process very helpful and thought I'd share the process and the tool (which has some really cool features).

Features

  1. Automatic generation of the email from the shelveset details
  2. Hyperlinks point to TFS Web Access so that you can review code from machines without any tools installed, even without a source enlistment. Yes it's true!!! The only thing you need is access to your office's intranet
  3. You can even use a Windows Mobile phone :) and even some non-MS browsers. OK, I guess I have sold this enough
  4. This is what the email looks like, with all the details pointed out
    (screenshot: crmail)
  5. Effectively you can see the file diff, history, blame (annotate), shelveset details, associated bugs, everything from your browser, and the best thing is that each of these takes one click.
    This is how the file diff looks in the browser
    (screenshot: webdiff)

Pre-reqs

  1. Team System Web Access (TSWA) 2008 power tool installed on your TFS server. For the shelveset link to work you'd need TSWA SP1 CTP. The other features work with the base TSWA 2008 install...
  2. Outlook installed on the machine on which the email is generated
  3. Enlistment and TFS client installed on the machine on which the email is generated
  4. For reviewers there is no pre-req other than a browser and email reader.

Dev process

  1. The developer creates a shelveset after he is done with his changes. He makes sure to fill in all the details, including the reviewers' email addresses separated by semicolons
  2. He runs the tool with a simple command
    crmail shelvesetname
  3. The email gets generated and opened; he fills in any additional information and hits send
  4. Done!!

Reviewers

Ok they just click on the email links. Since mostly these are managers what more do you expect out of them? Real devs will stick with firing up tfpt command line :)

Configuring the tool

  1. Download the binaries from here
  2. Unzip. Open the crmail.exe.config file and modify the values in it to point to your tfsserver and your code review distribution list (if you do not have one then make it empty)
  3. Check it in to some tools folder in your source control so that everyone in your team has access to it

Support

Self help is the best help :), so download the sources from here and enjoy. Buck Hodges's post on the direct link URLs will help in case you want to modify the sources to do more.

How does the .NET CF handle null reference


What happens when we have code as below

class B
{
    public virtual void Virt(){
        Console.WriteLine("Base::Virt");
    }
}

class Program
{
    static void Main(string[] args){
        B b = null;
        b.Virt(); // throws System.NullReferenceException
    }
}

Obviously we have a null reference exception being thrown. If you see the IL the call looks like

    L_0000: nop 
    L_0001: ldnull 
    L_0002: stloc.0 
    L_0003: ldloc.0 
    L_0004: callvirt instance void ConsoleApplication1.B::Virt()
    L_0009: nop 
    L_000a: ret 

So in effect you'd expect the jitter to generate the following kind of code (in processor instruction)

if (b == null)
   throw new NullReferenceException
else
   b->Virt() // actually call safely using the this pointer

However, generating null checks for every call is going to lead to code bloat. So to work around this on some platforms (e.g. .NETCF on WinCE 6.0 and above) it uses the following approach

  1. Hook up the native access violation exception (WinCE 6.0 supports this) to a method in the execution engine (EE)
  2. Do not generate any null checks; directly generate calls through references
  3. If the reference is null, a native AV (access violation) is raised because the invalid address 0 is accessed, and the hook method is called
  4. At this point the EE checks whether the source of the access violation (native code) is inside a jitted code block. If yes, it creates the managed NullReferenceException and propagates it up the call chain.
  5. If it's outside, then obviously either the CLR itself or some other native component is crashing, and there is nothing the EE can do about it.

Posted by abhinaba | 3 Comments

C# generates virtual calls to non-virtual methods as well


Some time back I posted about a case where non-virtual calls are used for virtual methods and promised to post about the reverse scenario. This issue of C# generating the callvirt IL instruction even for non-virtual method calls keeps coming back on C# discussion DLs every couple of months. So here it goes :)

Consider the following code

class B
{
    public virtual void Virt(){
        Console.WriteLine("Base::Virt");
    }

    public void Stat(){
        Console.WriteLine("Base::Stat");
    }
}

class D : B
{
    public override void Virt(){
        Console.WriteLine("Derived::Virt");
    }
}

class Program
{
    static void Main(string[] args)
    {
        D d = new D();
        d.Stat(); // should emit the call IL instruction
        d.Virt(); // should emit the callvirt IL instruction
    }
}

The basic scenario is that a base class defines a virtual method and a non-virtual method, and both are called through a reference to a derived class. The expectation is that the call to the virtual method (B.Virt) will go through the intermediate language (IL) callvirt instruction and the call to the non-virtual method (B.Stat) through the call IL instruction.

However, this is not true and callvirt is used for both. If we open the disassembly of the Main method using Reflector or ILDASM, this is what we see

    L_0000: nop 
    L_0001: newobj instance void ConsoleApplication1.D::.ctor()
    L_0006: stloc.0 
    L_0007: ldloc.0 
    L_0008: callvirt instance void ConsoleApplication1.B::Stat()
    L_000d: nop 
    L_000e: ldloc.0 
    L_000f: callvirt instance void ConsoleApplication1.B::Virt()
    L_0014: nop 
    L_0015: ret 

The question is why. There are two reasons that have been brought forward by the CLR team

  1. API change.
    The .NET team wanted a change of a method (API) from non-virtual to virtual to be non-breaking. Since the call is generated as callvirt anyway, a caller need not be recompiled if the callee changes to be a virtual method.
  2. Null checking
    If a call instruction is generated and the method body doesn't access any instance variable, then it is possible to call the method successfully even on a null object. This is currently possible in C++; see a post I made on this here. With callvirt there's a forced access to the this pointer, and hence the object on which the method is being called is automatically checked for null.

callvirt does come with an additional performance cost, but measurements showed that there's no significant performance difference between call with a null check and callvirt. Moreover, since the jitter has the full metadata of the callee, while jitting the callvirt it can generate processor instructions for a direct (non-virtual) call if it figures out that the callee is indeed non-virtual.

However, the compiler does try to optimize situations where it knows for sure that the target object cannot be null. E.g. for the expression i.ToString(), where i is an int, the call instruction is used to invoke ToString because Int32 is a value type (cannot be null) and sealed.
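The null-checking reason is easy to observe from C#. In the sketch below (Greeter and CallvirtNullDemo are names invented for the illustration), the method is non-virtual and never touches this, yet calling it through a null reference still throws, because the compiler emits callvirt for the call.

```csharp
using System;

public class Greeter
{
    // Non-virtual, and the body never touches 'this'.
    public string Hello()
    {
        return "hello";
    }
}

public static class CallvirtNullDemo
{
    // Returns true when calling Hello() through a null reference throws.
    public static bool ThrowsOnNull()
    {
        Greeter g = null;
        try
        {
            g.Hello(); // compiled to callvirt, so the receiver is null-checked
            return false;
        }
        catch (NullReferenceException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(ThrowsOnNull()); // True
    }
}
```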

Posted by abhinaba | 2 Comments

Guy or a Girl


One interesting aspect of working in an internationally distributed team is that sometimes it gets difficult to make common judgements. E.g. when we see a name we inherently figure out whether it's a male or a female name and refer to that person as such in email. The issue is that I cannot always make the same judgement for names from another country/culture.

In my previous team, someone continually referred to Khushboo as "he" in a long email thread. Khushboo didn't correct him, and it went on for some time until I pointed it out to him in a separate email. Today I was typing an email to someone and suddenly realized I had no idea whether one of the people I was referring to is male or female. I took a wild guess and I'm waiting to get corrected.

Posted by abhinaba | 5 Comments

Baby smash


What is the common thing between every programmer dad/mom? The moment they get onto a new UI platform they write a child proofing application for the keyboard.

Scott Hanselman has just posted his version, Baby Smash (via AmitChat). The funny thing is that I've written one in WPF and so did my ex-manager.

Posted by abhinaba | 1 Comments

Cell phone assault


In the last two weeks my cell phone got assaulted thrice. First it was someone sending me a virus over Bluetooth (a sis file, actually). This happened when I was taking a photograph of my daughter with the cell phone camera in a restaurant (Aromas of China, City Center mall in Hyderabad).

The next one was Bluetooth-based advertisement messages in the Forum Mall in Bangalore. They were actually sending offers of the hour over Bluetooth and I got 2 such messages.

The third incident was in the airport, when someone was again trying to send me a trojan app and make me open it.

I was really surprised by the rapid growth of cell phone based attacks. What's worse is that few people know about this. My wife had no idea that you can actually send applications over Bluetooth and that they can infect the phone.

Stylecop has been released


Microsoft released the internal tool StyleCop to the public under the fancy yet boring name of Microsoft Source Analysis for C#. Even though the name is boring, the product is not.

You'll love this tool when it imposes a consistent coding style across your team. You'll hate this tool when it imposes the same on you. The result is stunning-looking, consistently styled code which your whole team can follow uniformly.

StyleCop has been in use for a long time internally at Microsoft and many teams mandate its usage. My previous team, VSTT, used it as well. The only crib I had was that it didn't allow single-line getters and setters (and our team didn't agree to disable this rule either).

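// Both snippets assume a private backing field: private int foo;
// (returning Foo from its own getter would recurse forever)

// StyleCop didn't like this one
public int Foo
{
    get { return this.foo; }
}

// StyleCop wanted this instead
public int Foo
{
    get
    {
        return this.foo;
    }
}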

Read more about using StyleCop here. You can set it up to run as a part of your build process, as documented here. Since it is plugged in as an MSBuild project, you can use it as a part of the Team Foundation Build process as well.

Let the style wars begin in team meetings :)

Update: Jason corrected me in the comment, apparently StyleCop indeed allows single line getters and setters (seems like they got it fixed since the last time I used it).

Posted by abhinaba | 1 Comments

Building Scriptable Applications by hosting JScript


If you have played around with large applications, I'm sure you have been intrigued by how they have been built to be extensible. There are multiple options

  1. Develop your own extension mechanism where you pick up extension binaries and execute them.
    One managed code example is here, where the application loads DLLs (assemblies) from a folder and runs specific types from them. A similar unmanaged approach is to allow registration of GUIDs and use COM to load types that implement the corresponding interfaces
  2. Roll out your own scripting mechanism:
    One managed example is here, where on-the-fly compilation is used. With the DLR hosting mechanism coming up, this will be very easy going forward
  3. Support a standard scripting mechanism:
    This involves hosting JScript/VBScript inside the application and exposing a document object model (DOM) to it. So anyone can just write standard JScript to extend the application, very much like how JScript in a webpage can extend/program the HTML DOM.

Obviously the 3rd is the best choice if you are developing a native (unmanaged) solution. The advantages are many: a low learning curve (any JScript programmer can write extensions), built-in security and low cost.

In this post I'll try to cover how you go about doing exactly that. I found little online documentation and took help of Kaushik from the JScript team to hack up some code to do this.

The Host Interface

To host JScript you need to implement IActiveScriptSite. The code below shows how we do that, stripping out the details we do not want to discuss here (no fear :) all the code is present in the download pointed to at the end of the post). The code below is in the file ashost.h

class IActiveScriptHost : public IUnknown
{
public:
    // IUnknown
    virtual ULONG __stdcall AddRef(void) = 0;
    virtual ULONG __stdcall Release(void) = 0;
    virtual HRESULT __stdcall QueryInterface(REFIID iid,
                                             void **obj) = 0;

    // IActiveScriptHost
    virtual HRESULT __stdcall Eval(const WCHAR *source,
                                   VARIANT *result) = 0;
    virtual HRESULT __stdcall Inject(const WCHAR *name,
                                     IUnknown *unkn) = 0;
};

class ScriptHost : public IActiveScriptHost, public IActiveScriptSite
{
private:
    LONG _ref;
    IActiveScript *_activeScript;
    IActiveScriptParse *_activeScriptParse;

    ScriptHost(...){}
    virtual ~ScriptHost(){}

public:
    // IUnknown
    virtual ULONG __stdcall AddRef(void);
    virtual ULONG __stdcall Release(void);
    virtual HRESULT __stdcall QueryInterface(REFIID iid, void **obj);

    // IActiveScriptSite
    virtual HRESULT __stdcall GetLCID(LCID *lcid);
    virtual HRESULT __stdcall GetItemInfo(LPCOLESTR name,
                                          DWORD returnMask,
                                          IUnknown **item,
                                          ITypeInfo **typeInfo);
    virtual HRESULT __stdcall GetDocVersionString(BSTR *versionString);
    virtual HRESULT __stdcall OnScriptTerminate(const VARIANT *result,
                                                const EXCEPINFO *exceptionInfo);
    virtual HRESULT __stdcall OnStateChange(SCRIPTSTATE state);
    virtual HRESULT __stdcall OnEnterScript(void);
    virtual HRESULT __stdcall OnLeaveScript(void);
    virtual HRESULT __stdcall OnScriptError(IActiveScriptError *error);

    // IActiveScriptHost
    virtual HRESULT __stdcall Eval(const WCHAR *source,
                                   VARIANT *result);
    virtual HRESULT __stdcall Inject(const WCHAR *name,
                                     IUnknown *unkn);

public:
    static HRESULT Create(IActiveScriptHost **host) { ... }
};

Here we are defining an interface, IActiveScriptHost. ScriptHost implements IActiveScriptHost and also the required hosting interface IActiveScriptSite. IActiveScriptHost exposes 2 extra methods (Eval and Inject) that will be used from outside to easily host JS scripts.

In addition, ScriptHost also implements a factory method, Create. This method does the heavy lifting of using COM querying to get the various interfaces it needs (IActiveScript, IActiveScriptParse) and stores them in the corresponding pointers.

Instantiating the host

The client of this host class creates the ScriptHost instance by using the following (see ScriptHostBase.cpp)

IActiveScriptHost *activeScriptHost = NULL;
HRESULT hr = S_OK;
HRESULT hrInit = S_OK;

hrInit = CoInitializeEx(NULL, COINIT_APARTMENTTHREADED);
if(FAILED(hrInit)) throw L"Failed to initialize";

hr = ScriptHost::Create(&activeScriptHost);
if(FAILED(hr)) throw L"Failed to create ScriptHost";

With this the script host is available through the activeScriptHost pointer and we already have the JScript engine hosted in our application.

Evaluating Scripts

After hosting the engine we need to make it do something interesting. This is where the IActiveScriptHost::Eval method comes in.

HRESULT __stdcall ScriptHost::Eval(const WCHAR *source,
                                   VARIANT *result)
{
    assert(source != NULL);

    if (source == NULL)
        return E_POINTER;

    return _activeScriptParse->ParseScriptText(source, NULL,
                                               NULL, NULL, 0, 1,
                                               SCRIPTTEXT_ISEXPRESSION,
                                               result, NULL);
}

Eval accepts the text of the script, executes it using IActiveScriptParse::ParseScriptText and returns the result.

So effectively we can accept input from the console and evaluate it (or read a file and interpret the complete script in it).

while (true)
{
    wcout << L">> ";
    getline(wcin, input);
    if (quitStr.compare(input) == 0) break;

    if (FAILED(activeScriptHost->Eval(input.c_str(), &result)))
    {
        throw L"Script Error";
    }

    // VT_I4 (value 3): print the result only when it is a 32-bit integer
    if (result.vt == VT_I4)
        wcout << result.lVal << endl;
}

So all this is fine, and at the end you can run the app (which BTW is a console app) and this is what you can do.

JScript sample Host
q! to quit

>> Hello = 7
7
>> World = 6
6
>> Hello * World
42
>> q!
Press any key to continue . . .

So you have extended your app to do maths for you, or rather to run basic scripts, which is exciting but not of much value.

Extending your app

Once we are past hosting the engine and running scripts inside the application we need to go ahead with actually building the application's DOM and injecting it into the hosting engine so that JScript can extend it.

If you already have a native application which is built on COM (IDispatch) then you have nothing more to do. But let's pretend that we actually have nothing and need to build the DOM.

To build the DOM you need to create an IDispatch-based DOM tree. There can be more than one root. In this post I'm not trying to cover how to build IDispatch-based COM objects (which you'd do using ATL or some other such means). Instead, for simplicity, we will roll out a hand-written implementation of the interface below.

class IDomRoot : public IDispatch
{
    // IUnknown
    virtual ULONG __stdcall AddRef(void) = 0;
    virtual ULONG __stdcall Release(void) = 0;
    virtual HRESULT __stdcall QueryInterface(REFIID iid,
                                             void **obj) = 0;

    // IDispatch
    virtual HRESULT __stdcall GetTypeInfoCount(UINT *pctinfo) = 0;
    virtual HRESULT __stdcall GetTypeInfo(UINT iTInfo, LCID lcid,
                                          ITypeInfo **ppTInfo) = 0;
    virtual HRESULT __stdcall GetIDsOfNames(REFIID riid,
                                            LPOLESTR *rgszNames, UINT cNames, LCID lcid,
                                            DISPID *rgDispId) = 0;
    virtual HRESULT __stdcall Invoke(DISPID dispIdMember, REFIID riid,
                                     LCID lcid, WORD wFlags,
                                     DISPPARAMS *pDispParams, VARIANT *pVarResult,
                                     EXCEPINFO *pExcepInfo, UINT *puArgErr) = 0;

    // IDomRoot
    virtual HRESULT __stdcall Print(BSTR str) = 0;
    virtual HRESULT __stdcall get_Val(LONG* pVal) = 0;
    virtual HRESULT __stdcall put_Val(LONG pVal) = 0;
};

At the top we have the standard IUnknown and IDispatch methods and at the end we have our DOM root's own methods (Print, get_Val and put_Val). It exposes a Print method that prints a string and a property called Val (with a set and a get method for that property).

The class DomRoot implements this interface and an additional method named Create, which is the factory to create it. Once we are done with creating this object we will inject it into the JScript scripting engine. So our final script host code looks as follows

IActiveScriptHost *activeScriptHost = NULL;
IDomRoot *domRoot = NULL;
HRESULT hr = S_OK;
HRESULT hrInit = S_OK;

hrInit = CoInitializeEx(NULL, COINIT_APARTMENTTHREADED);
if(FAILED(hrInit)) throw L"Failed to initialize";

// Create the host
hr = ScriptHost::Create(&activeScriptHost);
if(FAILED(hr)) throw L"Failed to create ScriptHost";

// create the DOM Root
hr = DomRoot::Create(&domRoot);
if(FAILED(hr)) throw L"Failed to create DomRoot";

// Inject the created DOM Root into the scripting engine
activeScriptHost->Inject(L"DomRoot", (IUnknown*)domRoot);

What happens with the inject is as below

map<std::wstring, IUnknown*> rootList;
typedef map<std::wstring, IUnknown*>::iterator MapIter;
typedef pair<std::wstring, IUnknown*> InjectPair;

HRESULT __stdcall ScriptHost::Inject(const WCHAR *name,
                                     IUnknown *unkn)
{
    assert(name != NULL);

    if (name == NULL)
        return E_POINTER;

    _activeScript->AddNamedItem(name, SCRIPTITEM_GLOBALMEMBERS |
                                      SCRIPTITEM_ISVISIBLE);
    rootList.insert(InjectPair(std::wstring(name), unkn));

    return S_OK;
}

In Inject we store the name of the object and the corresponding IUnknown in a map (a lookup table keyed by name). Each time the script engine encounters a named object in the script, it calls GetItemInfo with that object's name; we then look the name up in the map and return the corresponding IUnknown

HRESULT __stdcall ScriptHost::GetItemInfo(LPCOLESTR name,
                                    DWORD returnMask,
                                    IUnknown **item,
                                    ITypeInfo **typeInfo)
{	
    MapIter iter = rootList.find(name);
    if (iter != rootList.end())
    {
        *item = (*iter).second;
        return S_OK;
    }
    else
        return E_NOTIMPL;
}

After that the script calls into that IDispatch to look for properties and methods and calls into them.

The Whole Flow

By now we have seen a whole bunch of code. Let's see how the whole thing works together. Let's assume we have an extension written in JScript and it calls DomRoot.Val = 5; this is what happens to get the whole thing to work

  1. During initialization we created the DomRoot object (DomRoot::Create), which implements IDomRoot, injected it into the script engine via AddNamedItem and stored it at our end in the rootList map.
  2. We call activeScriptHost->Eval(L"DomRoot.Val = 5;", ...) to evaluate the script. Eval calls _activeScriptParse->ParseScriptText.
  3. When the script parse engine sees the "DomRoot" name, it figures out that this is a valid name added with AddNamedItem and hence calls its host's ScriptHost::GetItemInfo("DomRoot");
  4. The host we have written looks up the same map that was filled during Inject and returns the object's IUnknown to the scripting engine. So at this point the scripting engine has a handle to our DOM root via an IUnknown to the DomRoot object
  5. The scripting engine does a QueryInterface on that IUnknown to get the IDispatch interface from it
  6. Then the engine calls IDispatch::GetIDsOfNames with the name of the property, "Val"
  7. DomRoot's implementation of GetIDsOfNames returns the dispatch ID of the Val property (which is 2 in our case)
  8. The script engine calls IDispatch::Invoke with that dispatch ID and a flag telling whether it wants the get or the set. In this case it's the set. Based on this, DomRoot redirects the call to DomRoot::put_Val
  9. With this we have the full flow from the host to the script and back to the DOM
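The steps above can be traced without any COM plumbing. The sketch below is a plain C# simulation and not the real Active Scripting API: FakeDomRoot, FakeScriptEngine, TryGetIdOfName and Access are all hypothetical names, and the dispatch ID for Print is an assumed value (only the ID 2 for Val comes from the post). It shows just the shape of the flow: look up the named item, map the member name to an ID, then invoke by ID.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the DomRoot object; plain C#, no COM involved.
public class FakeDomRoot
{
    public const int DispIdPrint = 1; // assumed value, not taken from the post
    public const int DispIdVal = 2;   // the post states that Val has dispatch ID 2

    private int val;

    // Plays the role of IDispatch::GetIDsOfNames (steps 6 and 7).
    public bool TryGetIdOfName(string name, out int dispId)
    {
        switch (name)
        {
            case "Print": dispId = DispIdPrint; return true;
            case "Val": dispId = DispIdVal; return true;
            default: dispId = -1; return false;
        }
    }

    // Plays the role of IDispatch::Invoke (step 8).
    public object Invoke(int dispId, bool isSet, object arg)
    {
        if (dispId == DispIdVal)
        {
            if (isSet) val = (int)arg; // corresponds to put_Val
            return val;                // corresponds to get_Val
        }
        if (dispId == DispIdPrint)
        {
            Console.WriteLine(arg);
            return null;
        }
        throw new ArgumentException("Unknown dispatch ID " + dispId);
    }
}

// Hypothetical stand-in for the host and script engine working together.
public class FakeScriptEngine
{
    private readonly Dictionary<string, FakeDomRoot> rootList =
        new Dictionary<string, FakeDomRoot>();

    // Step 1: remember the named item.
    public void Inject(string name, FakeDomRoot root)
    {
        rootList[name] = root;
    }

    // Steps 3 to 8: resolve the root by name, resolve the member ID, invoke.
    public object Access(string rootName, string member, bool isSet, object arg)
    {
        FakeDomRoot root = rootList[rootName];
        int dispId;
        if (!root.TryGetIdOfName(member, out dispId))
            throw new MissingMemberException(rootName, member);
        return root.Invoke(dispId, isSet, arg);
    }
}

public static class DispatchFlowDemo
{
    public static void Main()
    {
        FakeScriptEngine engine = new FakeScriptEngine();
        engine.Inject("DomRoot", new FakeDomRoot());

        engine.Access("DomRoot", "Val", true, 5);                        // DomRoot.Val = 5
        Console.WriteLine(engine.Access("DomRoot", "Val", false, null)); // prints 5
    }
}
```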

In action

JScript sample Host
q! to quit

>> DomRoot.Val = 5;
5
>> DomRoot.Val = DomRoot.Val * 10
50
>> DomRoot.Val
50
>> DomRoot.Print("The answer is 42");
The answer is 42

 

Source Code

First of all the disclaimer. Let me get it off my chest by saying that the DomRoot code is a super simplified COM object. It commits nothing less than sacrilege. You shouldn't treat it as a sample code. I intentionally didn't do a full implementation so that you can step into it without the muck of IDispatchImpl or ATL coming into your way.

However, you can treat the script hosting part (ashost, ScriptHostBase) as sample code (that is the idea of the whole post :) )

The code organization is as follows

ashost.cpp, ashost.h - The Script host implementation
DomRoot.cpp, DomRoot.h - The DOM Root object injected into the scripting engine
ScriptHostBase.cpp - Driver

Note that in a real-life example the driver should load JScript files from a given folder and execute them.

Download from here

Posted by abhinaba | 3 Comments

Model, View, Controller


These days the whole world is abuzz with the Model, View, Controller (MVC) architecture. This is not something new; it has been known to computer scientists for close to 30 years. I guess the new-found popularity is due to the fact that it has heavy application in web development and a lot of mainstream web development platforms are adding support for it. Ruby on Rails and ASP.NET MVC are classic examples.

Coding Horror has a nice post on this topic. I liked the following statement it made

"Skinnability cuts to the very heart of the MVC pattern. If your app isn't "skinnable", that means you've probably gotten your model's chocolate in your view's peanut butter, quite by accident."

I actually use a very similar concept. The moment I see an application's architecture (be it from an interview candidate or a friend showing off something) I ask the question "Can you write a console version of this easily?". If the answer is no, or it would need a redesign, the separation of model, view and controller is not right. You are going to have a nightmare if you write and maintain that software.

Posted by abhinaba | 1 Comments

You need to be careful about how you use count in for-loops


Let's consider the following code

 MyCollection myCol = new MyCollection();
 myCol.AddRange(new int[] { 1, 2, 3, 4, 5, });
 for (int i = 0; i < myCol.Count; ++i)
 {
     Console.WriteLine("{0}", i);
 }

What most people forget to consider is the condition check of the for loop (i < myCol.Count). In this case it is actually a call to the method get_Count. The compiler doesn't optimize this away (even when inlined)!! Since the condition is evaluated on every iteration, there's actually a method call happening each time. For most .NET code this is low-cost, because ultimately it's just a small method returning the count from an instance variable (if it is inlined, the method call overhead also goes away). Something like this is typically written for Count.

public int Count
{
    get
    {  
        return this.count;
    }
}
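
To make the repeated call visible, here is a small sketch that counts how often the getter runs. It is a C++ analogue rather than the C# from the post, and CountingCollection, CallsWhenInCondition and CallsWhenHoisted are hypothetical names invented for the illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical collection whose Count() records how often it is called.
class CountingCollection {
public:
    explicit CountingCollection(std::vector<int> items)
        : items_(std::move(items)) {}

    std::size_t Count() const {
        ++countCalls_;  // bookkeeping only, so we can observe the calls
        return items_.size();
    }

    std::size_t CountCalls() const { return countCalls_; }

private:
    std::vector<int> items_;
    mutable std::size_t countCalls_ = 0;
};

// Count() sits in the loop condition, so it runs once per check:
// once for every element plus the final check that ends the loop.
std::size_t CallsWhenInCondition(const CountingCollection& col) {
    for (std::size_t i = 0; i < col.Count(); ++i) {
        // loop body intentionally empty
    }
    return col.CountCalls();
}

// Count() is read once up front and the stored value is reused.
std::size_t CallsWhenHoisted(const CountingCollection& col) {
    const std::size_t count = col.Count();
    for (std::size_t i = 0; i < count; ++i) {
        // loop body intentionally empty
    }
    return col.CountCalls();
}
```

For a fresh five-element collection, CallsWhenInCondition reports 6 calls and CallsWhenHoisted reports 1.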

Someone wrote something very similar in C, using strlen in the for-loop condition. That makes the following code O(n^2), because the length is re-calculated on every iteration...

for(int i = 0; i < strlen(str); ++i) {...}

Why the compiler cannot optimize it is another question. For one, it's a method call and it is difficult to predict whether it will have any side effects. So optimizing the call away could have other disastrous effects.

So store the count and re-use it if you know that the count is not going to change...
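
For the strlen case, that advice could look like the sketch below. CountChar is a hypothetical helper made up for the example; the only point is that std::strlen runs once before the loop instead of on every condition check.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: counts how often `target` occurs in a C string.
std::size_t CountChar(const char* str, char target) {
    std::size_t hits = 0;
    // The length is computed once; the loop body never changes the string,
    // so the stored value stays valid for the whole loop.
    const std::size_t len = std::strlen(str);
    for (std::size_t i = 0; i < len; ++i) {
        if (str[i] == target) {
            ++hits;
        }
    }
    return hits;
}
```

This only holds as long as the loop does not modify the string; if it can, the stored length goes stale and has to be refreshed.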

Posted by abhinaba | 3 Comments

Alternatives to XML


Though not as much as Jeff Atwood, I don't like the overuse of XML either. In our last project we used XML in a bunch of places where it made sense and also planned to use it in a bunch of other places where it didn't. For some strange reason some folks think it's actually readable, and suggested we use XML to dump the user actions we recorded because it's easy to parse and is human readable/editable. I'm perfectly fine doing it in XML, but definitely not for that reason.

Anyways, sense prevailed and even though we do store it in XML we dump out automation source code in obviously more readable C#/VB.NET.

Before I get completely sidetracked, let me state that this post is not about XML or about JSON but about the fact that there exist many alternatives to both. Head on to here (via Coding Horror)...

Posted by abhinaba | 0 Comments