SimpleScript Part Five: Named Items and Modules

Named Items

"Named items" are what we call the "top level" objects of the host provided object model.  WScript in WSH, window in Internet Explorer, Response in ASP, are all named items.  A host tells an initialized script engine about named items via the aptly named AddNamedItem method on the IActiveScript interface.

HRESULT ScriptEngine::AddNamedItem(const WCHAR * pszName, DWORD flags)

A few things should immediately seem a little weird about this interface. 

The First Four Flags

First off, what are the flags?  There are six.  The first four are pretty straightforward:

If SCRIPTITEM_ISVISIBLE is set then the name of the named item is added to the global namespace. 

Uh, OK… why would you ever NOT want this set?  Didn't I just say that named items were specifically for injecting named object model roots into the engine?  What good is a named item if you can't see its name?

That brings us to the second flag; if SCRIPTITEM_GLOBALMEMBERS is set then all (immediate) children of the named item are treated as though they are themselves top-level objects/methods.  That's how in Internet Explorer you can say

window.alert("hello");

or

alert("hello");

and they do the same thing.  IE tells the script engine that window is a visible named item with global members.

Now it makes a little more sense why you might want to have an invisible named item.  What if you had an object with lots of methods and properties that you wanted available in the global namespace, but the object itself didn’t have a sensible name?  I can't think of any script host offhand that does that, but the capability is there if you need it.
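To make the interaction of the first two flags concrete, here is a minimal sketch of how an engine might compute the set of names usable without qualification. It is not the SimpleScript source: HostItem and GlobalNames are hypothetical names invented for the illustration, and the flag values are written from memory of activscp.h, so treat them as assumptions.

```cpp
#include <set>
#include <string>
#include <vector>

// Flag values as recalled from activscp.h; assumptions, check the real header.
const unsigned long SCRIPTITEM_ISVISIBLE     = 0x00000002;
const unsigned long SCRIPTITEM_GLOBALMEMBERS = 0x00000008;

// Hypothetical stand-in for a named item: its name, its flags, and the
// names of its immediate children.
struct HostItem {
    std::wstring name;
    unsigned long flags;
    std::vector<std::wstring> members;
};

// Returns every name a script could use unqualified.
std::set<std::wstring> GlobalNames(const std::vector<HostItem>& items) {
    std::set<std::wstring> names;
    for (const HostItem& item : items) {
        // A visible item contributes its own name.
        if (item.flags & SCRIPTITEM_ISVISIBLE)
            names.insert(item.name);
        // An item with global members contributes its immediate children,
        // whether or not the item itself is visible.
        if (item.flags & SCRIPTITEM_GLOBALMEMBERS)
            names.insert(item.members.begin(), item.members.end());
    }
    return names;
}
```

With window marked visible with global members, both window and alert end up in the set. An invisible item with global members contributes only its children.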

There's a second reason why you might want a named item to be invisible, but we'll get to that in a minute.

The third flag is SCRIPTITEM_ISSOURCE.  If that's set then we know that this object is an event source.  If the language supports implicit event binding and the host moves the engine into connected state, we're going to need to know which named items to hook up to which events.  It can be very expensive to do that hookup, so this provides a simple optimization.  If the host knows that a particular named item does not source events, it can choose to not mark the named item as a source, and we therefore never spend any time trying to hook up event sinks to it.
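The optimization amounts to a filter applied when the engine goes to connected state. A rough sketch, assuming a hypothetical NamedItem record and a hypothetical ItemsNeedingEventSinks helper (neither is from SimpleScript), with the flag value written from memory of activscp.h:

```cpp
#include <string>
#include <vector>

// Value as recalled from activscp.h; an assumption.
const unsigned long SCRIPTITEM_ISSOURCE = 0x00000004;

// Hypothetical record the engine keeps for each named item.
struct NamedItem {
    std::wstring name;
    unsigned long flags;
};

// Returns the names of the items worth the expensive event hookup.
// Items not marked as sources are skipped without any further work.
std::vector<std::wstring> ItemsNeedingEventSinks(const std::vector<NamedItem>& items) {
    std::vector<std::wstring> result;
    for (const NamedItem& item : items) {
        if (item.flags & SCRIPTITEM_ISSOURCE)
            result.push_back(item.name);
    }
    return result;
}
```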

The fourth flag is SCRIPTITEM_ISPERSISTENT.  Recall that I said a while back that when the engine goes back to uninitialized state after being initialized, we throw away "some" named items, where "some" was to be defined later.  Now you know -- the engine remembers named items marked as persistent.  Information about those named items is not thrown away until the engine is closed.  Also, cloning an engine is basically making a copy of the uninitialized state of an engine, so persistent named items get cloned when their engine gets cloned.  As we'll see, this fact has implications for our implementation.
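A sketch of what "remembering" persistent items could look like, under the assumption that the engine keeps a simple list of name/flag records. NamedItems, ResetToUninitialized and Clone are hypothetical names for this illustration, and the flag value is written from memory of activscp.h.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Value as recalled from activscp.h; an assumption.
const unsigned long SCRIPTITEM_ISPERSISTENT = 0x00000040;

struct NamedItem {
    std::wstring name;
    unsigned long flags;
};

// Hypothetical named item list; not the SimpleScript implementation.
class NamedItems {
public:
    void Add(const std::wstring& name, unsigned long flags) {
        m_items.push_back(NamedItem{name, flags});
    }

    // Going back to uninitialized state: forget everything not persistent.
    void ResetToUninitialized() {
        m_items.erase(
            std::remove_if(m_items.begin(), m_items.end(),
                           [](const NamedItem& item) {
                               return (item.flags & SCRIPTITEM_ISPERSISTENT) == 0;
                           }),
            m_items.end());
    }

    // A clone is a copy of the uninitialized state, so it receives
    // only the persistent items.
    NamedItems Clone() const {
        NamedItems copy;
        for (const NamedItem& item : m_items) {
            if (item.flags & SCRIPTITEM_ISPERSISTENT)
                copy.m_items.push_back(item);
        }
        return copy;
    }

    bool Contains(const std::wstring& name) const {
        for (const NamedItem& item : m_items) {
            if (item.name == name)
                return true;
        }
        return false;
    }

private:
    std::vector<NamedItem> m_items;
};
```

Note that the sketch stores only names and flags, never the host objects themselves.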

Modules

I'm sure you have a general idea of what I mean by a "module", though if you Google define:module you'll see that everyone has a slightly different definition.  Modules in the script engine sense are philosophically fairly straightforward.  I often want to have some way to say "this collection of functions can play with each other, but is isolated from this other collection of functions".  I want to be able to resolve name collisions by having two methods with the same name coexist in different modules. 

Well, I want that stuff in languages designed for "programming in the large".  When using script languages, more often than not you simply don't need to chunk stuff into modules.  But, bizarrely enough, the script engines support modules, and in a pretty goofy way at that.  Of course, any language implementor can implement module semantics however they want, but I see no reason to mess around with the de facto standards.  Here's how modules work in VBScript and JScript:

  • There is a "global" module.  Procedures defined in the global module are callable from any module.  This is where the "built in" methods in VBScript and JScript go.
  • Every named item is associated with a unique module, with some exceptions:
    • Named items with global members are associated with the global module.
    • Named items marked with SCRIPTITEM_NOCODE are not associated with any module.
  • All visible named items are added to the global namespace, except for named items marked as SCRIPTITEM_CODEONLY. Those are just names of modules and are associated with no object.
  • Important distinction: though all visible named items are added to the global module's namespace, procedures in the named items' modules are not visible from the global module.
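The rules in the list above can be written down as a few predicates over the flags. This is only a sketch: ModuleKind, ModuleFor, AddsNameToGlobalNamespace and HasObject are hypothetical names, the flag values are written from memory of activscp.h, and checking SCRIPTITEM_NOCODE before SCRIPTITEM_GLOBALMEMBERS is an assumption, since the rules do not say which wins when both are set.

```cpp
// Values as recalled from activscp.h; assumptions.
const unsigned long SCRIPTITEM_ISVISIBLE     = 0x00000002;
const unsigned long SCRIPTITEM_GLOBALMEMBERS = 0x00000008;
const unsigned long SCRIPTITEM_CODEONLY      = 0x00000200;
const unsigned long SCRIPTITEM_NOCODE        = 0x00000400;

enum class ModuleKind {
    Global,  // shares the global module
    Own,     // gets a unique module of its own
    None     // has no module at all
};

// Which module is a named item associated with?
ModuleKind ModuleFor(unsigned long flags) {
    if (flags & SCRIPTITEM_NOCODE)          // assumed to take precedence
        return ModuleKind::None;
    if (flags & SCRIPTITEM_GLOBALMEMBERS)
        return ModuleKind::Global;
    return ModuleKind::Own;
}

// Visible items put their name in the global namespace unless they are
// code-only, in which case the name is just the name of a module.
bool AddsNameToGlobalNamespace(unsigned long flags) {
    return (flags & SCRIPTITEM_ISVISIBLE) != 0 &&
           (flags & SCRIPTITEM_CODEONLY) == 0;
}

// Code-only items are associated with no object.
bool HasObject(unsigned long flags) {
    return (flags & SCRIPTITEM_CODEONLY) == 0;
}
```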

Let me try to make this a little more concrete.  Let's consider a hypothetical declarative language that supports embedded imperative script.  You might want to have something like this:

<application>
      <script>
            function toggle ( x )
            {
                  if (x == false) return true; else return false;
            }
      </script>
      <form name="fred">
            <checkbox name="chuck" checked="false" />
            <button name="bob">
                  <caption>Hello world!</caption>
                  <click>
                        chuck.checked = toggle(chuck.checked)
                  </click>
                  <script>
                        function foo()
                        {
                              // button-specific code here
                        }

And so on.  Suppose the host defined fred as a visible named item with global members, bob and chuck as named items each with their own module.  Then code in bob's event handlers can access anything in the global application namespace and any member of fred.  But if we now add script to chuck, chuck's module cannot see bob's code.  chuck is free to define its own function foo, which will not collide.
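The visibility rule in that example can be modelled as a two-step lookup: a call resolves against the caller's own module first, then against the global module, and never against a sibling module. The sketch below uses a hypothetical Modules class with the empty string standing for the global module. It models only procedure names, not members of fred, and it is not how VBScript or JScript are implemented.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical model: module name -> procedures defined in that module.
// The empty string is used as the name of the global module.
class Modules {
public:
    void Define(const std::string& module, const std::string& proc) {
        m_procs[module].insert(proc);
    }

    // Can code running in module `caller` call `proc` unqualified?
    // Look in the caller's own module, then in the global module.
    bool CanCall(const std::string& caller, const std::string& proc) const {
        return Has(caller, proc) || Has("", proc);
    }

private:
    bool Has(const std::string& module, const std::string& proc) const {
        auto it = m_procs.find(module);
        return it != m_procs.end() && it->second.count(proc) > 0;
    }

    std::map<std::string, std::set<std::string>> m_procs;
};
```

With toggle defined globally and foo defined in bob's module, bob can call both, chuck can call only toggle, and the global module cannot see foo. chuck defining its own foo changes nothing for bob.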

What if bob wants to access code in chuck's module?  We'll see in a later entry how bob can do that -- there are many subtle issues involved and we'll need more infrastructure built into SimpleScript before we can tackle it.

Where's the Object?

One more weird thing -- where's the object?  The method takes a name and some flags, but no object. 

Let me answer that question with another question: what happens if a persistent named item's object is thread-affinitized, and you clone the engine and initialize the clone on a different thread?  The clone can't use the object from the previous thread! 

To solve this problem, we defer getting the actual object until we need it.  The engine calls the site back and asks for the object.  I haven't implemented that code yet; I'll talk more about that next time.
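The deferral is ordinary lazy initialization with one twist: a clone must start with an empty cache so that it asks its own site again. Here is a minimal sketch under stated assumptions. HostObject, SiteCallback and LazyNamedItem are hypothetical, and a std::function stands in for the call back to the site (in the real interfaces that call is, as far as I know, IActiveScriptSite::GetItemInfo).

```cpp
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in for whatever the host hands back.
struct HostObject {
    int id;
};

// Hypothetical stand-in for "ask the site for the object with this name".
using SiteCallback = std::function<HostObject*(const std::wstring&)>;

class LazyNamedItem {
public:
    explicit LazyNamedItem(std::wstring name)
        : m_name(std::move(name)), m_object(nullptr) {}

    // Fetch the object on first use and cache it afterwards.
    HostObject* GetObject(const SiteCallback& site) {
        if (m_object == nullptr)
            m_object = site(m_name);
        return m_object;
    }

    // A clone keeps the name but not the cached object, so a clone
    // initialized on another thread asks its own site for a fresh object.
    LazyNamedItem Clone() const {
        return LazyNamedItem(m_name);
    }

private:
    std::wstring m_name;
    HostObject* m_object;
};
```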

The Implementation

I've added a named item list object to SimpleScript in nameditemlist.cpp but I haven't quite figured out how I want to implement modules yet. 

As you can see, I've rolled my own hash table rather than using hash_map.  There was quite a long series of comments the other day about the pros and cons.  I chose to roll my own rather than using an off-the-shelf solution for several reasons:

  • the implementation does not require a rocket-science hash table.  It's very unlikely that we're going to get a huge number of named items in a script engine.  Even very complex web pages seldom add more than a couple dozen named items.  (In IE, every button, form, etc, is a named item.)  Thus, I've implemented a fixed-number-of-buckets hashing-with-chaining table with a very simple string hashing algorithm.
  • I really have no clue how hash_map works, and I want to spend zero time debugging my way through acres of template code if there's a problem.
  • I realized that I was starting to succumb to Object Happiness as I was building a templatized hash table of my own.  I'm going to need two or three hash tables in this project all told.  And it's not like templates actually make the generated code smaller or more efficient (though in C#, generics can -- but that's another story.)  I'm going to keep it simple and avoid getting Happy.
  • hash_map requires that new throw exceptions.  As I've mentioned before, that gives me the shakes.  I don't like mixing C++ exception handling into what is fundamentally a COM program.
  • I think hash_map is threadsafe for the applications I'm going to use it in.  I'm not sure.  That makes me nervous.  I am sure that my own code is going to be a lot more amenable to analysis with respect to the goofy script engine threading contract.
  • I ended up spending WAAAY more time investigating the template code than it took me to write the hash table from scratch.  I'm doing this in my spare time here folks... 

All those things totally outweigh the negligible benefits of re-using an existing generic map, no matter how bulletproof and performant it is.
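For readers who want to picture the kind of table described above, here is a minimal sketch of a fixed-bucket-count, hashing-with-chaining table with a very simple string hash that never throws on allocation failure. It is not the code in nameditemlist.cpp: NameTable, the bucket count of 37 and the multiply-by-31 hash are all assumptions made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <new>

// Hypothetical table mapping a name to its flags.
class NameTable {
    struct Node {
        wchar_t* name;
        unsigned long flags;
        Node* next;
    };
    static const std::size_t BucketCount = 37;  // arbitrary small prime
    Node* m_buckets[BucketCount];

    // Very simple string hash: multiply by 31 and add each character.
    static std::size_t Hash(const wchar_t* s) {
        std::size_t h = 0;
        while (*s)
            h = h * 31 + static_cast<std::size_t>(*s++);
        return h % BucketCount;
    }

public:
    NameTable() {
        for (std::size_t i = 0; i < BucketCount; ++i)
            m_buckets[i] = nullptr;
    }
    ~NameTable() {
        for (std::size_t i = 0; i < BucketCount; ++i) {
            while (Node* n = m_buckets[i]) {
                m_buckets[i] = n->next;
                delete[] n->name;
                delete n;
            }
        }
    }
    NameTable(const NameTable&) = delete;
    NameTable& operator=(const NameTable&) = delete;

    // Returns a pointer to the stored flags, or nullptr if not found.
    const unsigned long* Find(const wchar_t* name) const {
        assert(name != nullptr);
        for (Node* n = m_buckets[Hash(name)]; n != nullptr; n = n->next) {
            if (std::wcscmp(n->name, name) == 0)
                return &n->flags;
        }
        return nullptr;
    }

    // Returns false on a duplicate name or on out-of-memory; never throws.
    bool Add(const wchar_t* name, unsigned long flags) {
        assert(name != nullptr);
        if (Find(name) != nullptr)
            return false;
        std::size_t len = std::wcslen(name);
        Node* node = new (std::nothrow) Node;
        if (node == nullptr)
            return false;
        node->name = new (std::nothrow) wchar_t[len + 1];
        if (node->name == nullptr) {
            delete node;
            return false;
        }
        std::wmemcpy(node->name, name, len + 1);  // copies the terminator too
        node->flags = flags;
        std::size_t bucket = Hash(name);
        node->next = m_buckets[bucket];           // push on the front of the chain
        m_buckets[bucket] = node;
        return true;
    }
};
```

The sketch reports failure with a bool where a COM program would more likely return an HRESULT. It uses new (std::nothrow) so that no C++ exception crosses the table's boundary.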

That reminds me, I wanted to talk more about the script engine threading model.  My earlier post on the subject glossed over some important details.

My team hit our Whidbey Beta One Zero Bug Bounce yesterday, so I'm off for a long weekend far, far away from computers.  I'll cut it short right now; next week I'll pick it up again and talk a bit more about the threading model, the implications of engine cloning, and I'll kick around some ideas about how to implement modules and code blocks.

  • "Performant"???
  • Yes, performant. Perfectly good word.

    I've had that conversation already.

    http://weblogs.asp.net/ericlippert/archive/2003/10/14/53209.aspx#53212

  • BTW, as you are using COM here, there is actually another "hash table" implementation you could have used: a COM bind context (I put hash table in quotes because I don't know how the bind context is implemented internally). Simply create it using CreateBindCtx and then use IBindCtx::RegisterObjectParam and IBindCtx::RevokeObjectParam. The only downside I can think of is that you would need to implement IUnknown for the table items. But that's not hard, now is it? ;-)
  • >Yes, performant. Perfectly good word.

    What's wrong with "efficient"? Or just "fast"?
  • Because I mean something subtly different when I say "performant" than I mean when I say "fast" or "efficient".

    I know I've written about this before -- I think it was on Mike's blog, but my google-fu is failing me.

    Briefly, "efficient" means "high ratio of work to waste". If we say that waste is, say, processor cycles used unnecessarily, then clearly there is a correlation between performance and efficiency. But correlation does not two things the same make. (If so smart Yoda is, why not words right order in his sentences put?)

    Same with "fast" -- performance is about more than making something as fast as possible.
  • Re: bind contexts: interesting idea, but I try to avoid using things for purposes other than what they were intended. Bind contexts were invented to solve a particular problem and I'd find it misleading and weird to have code that re-uses them to solve some other problem.

    If I wanted to re-use a COM hash table, I'd cocreate the Scripting.Dictionary object long before I used a binding context.
  • What would the flag SCRIPTITEM_NOCODE | SCRIPTITEM_CODEONLY mean? I'm assuming you can set this combo since I didn't see anything disallowing it.

    Also, why use a WCHAR* instead of a BSTR? It looks like you're going to be using BSTRs internally everywhere and there are asserts preventing the null string from being used as something.

    Side question: do those asserts compile to anything in optimized builds or would Find(0) just crash and burn?
  • > Because I mean something subtly different when I say "performant" than I mean when I say "fast" or "efficient".

    Perhaps you could divulge what you do mean when you say "performant", then. It's clearly obvious to you, but just as clearly not obvious to me.

    (I, too, have had something of a Google-fu failure. I can find other people using the word, but all of them also seem to assume it doesn't require definition.)
  • > What would the flag SCRIPTITEM_NOCODE | SCRIPTITEM_CODEONLY mean? I'm assuming you can set this combo since I didn't see anything disallowing it

    It would be just like what it sounds like -- a named item which does not have any code, and doesn't have an object either.

    That, of course, would be completely pointless. But just because there's no point to that doesn't mean it should be illegal.
  • > Also, why use a WCHAR* instead of a BSTR?

    Why use it where? In the interface? Because that's how the interface is declared, so I have to use WCHAR* semantics, not BSTR semantics. I can't just arbitrarily redefine an existing interface; existing hosts do not pass BSTRs.

    Or did you mean "why was the interface designed that way?" Beats me. That was in 1995, before my time. Were I designing the interface, I'd have used BSTRs probably.
  • > Side question: do those asserts compile to anything in optimized builds or would Find(0) just crash and burn?

    As you can see from assert.h, in the retail build an assert is a no-op.

    The point of an assertion is to document what logically must be true, and to catch cases where your logic is wrong. Assertions don't add error handling or prevent crashes, they inform the testers of the causes of crashes in the debug build so that you can track the problem down faster. Passing NULL to Find will die horribly in the debug build too, but at least it will tell you why it's dying horribly before it does.
  • > Perhaps you could divulge what you do mean when you say "performant", then. It's clearly obvious to you, but just as clearly not obvious to me.

    I mean "performing well". That of course is begging the question -- to know what I mean by "performing well", uh, I guess read my performance archive.

    The gist that I try to get across when I talk about performance is that yes, performance is about speed and efficiency, but it is about speed and efficiency in a context.

    Let me give you an example. Compare calling a virtual function to calling a static function. Calling the virtual function is less efficient -- it has to do at least one extra indirection. It's also slower. That indirection can, in some cases, double the time spent making a function call.

    Does turning virtual functions into static functions make your program more performant? NO -- it is very unlikely that the performance of a particular program is gated upon the two billionths of a second that it takes to do that indirection vs the one billionth it takes to bind statically.
  • I think the resistance is just to the word "performant" itself, not the concept it represents. It's not a word that's used outside MS (although that's probably changing because other people are gradually picking it up and using it much to my chagrin).

    It's only slightly less grating on the ears than using "ask" as a noun, instead of the correct word "request" ;)
  • Eric, for someone on a long weekend far, far away from computers, you sure are posting a lot ;-)
  • > But just because there's no point to that doesn't mean it should be illegal.

    True, but just because something can be done doesn't mean it should be supported. While I would guess nobody has ever filed a bug related to this, it's generally easier to disallow something at creation than to test that you support it correctly everywhere it can be used.

    > Or did you mean "why was the interface designed that way?"

    Yes, that's what I was going after, thanks.

    I later found assert.h by checking out the SimpleScript category. Actually, I originally thought ScriptEngine was passing the name directly to NamedItemList and that the assert was the only validation on the input from a public API method. But I see now that there's a null check along that path.