What are threading models, and what threading model do the script engines use?


I've got a few ideas for some future posts that depend on the reader understanding a little bit about COM threading.  Since I myself understand only a little bit about COM threading, I'll just do a brain dump for you all right here.

 

I'm sure you all know about multi-threaded applications.  The idea is that the operating system switches back and forth between threads within a process using a scheduling algorithm of some sort.  When a thread is "frozen" all the context for that thread -- basically, the values of the registers in the processor -- is saved, and when it is "thawed" the state is restored and the thread continues like it was never interrupted.

 

That works great right up to the point where two threads try to access the same memory at the same time.  Consider, for example, the standard implementation of IUnknown::Release():

 

ULONG MyClass::Release()
{
  --this->m_cRef;
  if (this->m_cRef == 0)
  {
    delete this;
    return 0;
  }
  return this->m_cRef;
}

 

Now suppose the ref count is two and two threads try to each do a single release.  That should work just fine, right?

 

Wrong.  The problem is that although --this->m_cRef looks like a single "atomic" operation, the compiler actually spits out code that acts something vaguely like this pseudo-code:

 

  // "this" is stored in Register1

  1 copy address of "m_cRef" field of pointer stored in Register1 to Register2

  2 copy contents of address stored in Register2 to Register3

  3 decrease contents of Register3 by one

  4 copy contents of Register3 back to address stored in Register2

  5 compare contents of Register3 to zero, store Boolean result of comparison in Register4

  6 if Register4 is false then return contents of Register3

  7 ... etc -- do deletion, return zero

 

Notice that the compiler can be smart and re-use the contents of Register3 instead of fetching this->m_cRef three times.  The compiler knows that no one has changed it since the decrease.

 

So suppose we have a red thread and a blue thread.  Each has their own registers.  Suppose the processor schedules them in this order:

 

  1 copy address of "m_cRef" field of Register1 to Register2

  2 copy contents of address stored in Register2 to Register3

 

CONTEXT SWITCH! Save    Register1 = this, Register2 = &m_cRef, Register3 = 2

 

  1 copy address of "m_cRef" field of Register1 to Register2

  2 copy contents of address stored in Register2 to Register3

 

CONTEXT SWITCH! Save    Register1 = this, Register2 = &m_cRef, Register3 = 2

CONTEXT SWITCH! Restore Register1 = this, Register2 = &m_cRef, Register3 = 2

 

  3 decrease contents of Register3 by one

  4 copy contents of Register3 to address stored in Register2

  5 compare contents of Register3 to zero, store Boolean result of comparison in Register4

  6 if Register4 is false then return contents of Register3

 

Register4 is false because Register3 = 1, so this returns 1.

 

CONTEXT SWITCH! Restore Register1 = this, Register2 = &m_cRef, Register3 = 2

 

And now you see where this is going, I'm sure.  Because the original value was stored in the red thread before the blue thread decremented it, we've lost a decrement.  Both threads will return 1. This object's ref count will never go to zero, and its memory will leak.  A similar problem plagues AddRef -- you can lose increment operations, which causes memory to be freed too soon.
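
To make the schedule above concrete, here is a minimal sketch that replays it on a single thread, so the outcome is reproducible instead of depending on the scheduler. The names ThreadContext, Outcome, loadCount, decrementAndStore and replayLostDecrement are hypothetical and exist only for this illustration; each "thread" is modeled as nothing more than its private copy of Register3.

```cpp
#include <cassert>

// Hypothetical model of one thread's saved context: only Register3,
// the thread's private copy of m_cRef, matters for this walkthrough.
struct ThreadContext {
    unsigned long register3 = 0;
};

// What each thread's Release() returned, plus the final shared count.
struct Outcome {
    unsigned long redReturned;
    unsigned long blueReturned;
    unsigned long finalRefCount;
};

// Steps 1-2: copy the shared counter into the thread's own register.
inline void loadCount(ThreadContext& thread, const unsigned long& sharedCount) {
    thread.register3 = sharedCount;
}

// Steps 3-6: decrement the private copy, write it back, return it.
inline unsigned long decrementAndStore(ThreadContext& thread, unsigned long& sharedCount) {
    --thread.register3;
    sharedCount = thread.register3;
    return thread.register3;
}

// Replays the interleaving from the text with a starting ref count of two.
inline Outcome replayLostDecrement() {
    unsigned long sharedCount = 2;  // stands in for this->m_cRef
    ThreadContext red;
    ThreadContext blue;

    loadCount(red, sharedCount);   // red runs steps 1-2, then is switched out
    loadCount(blue, sharedCount);  // blue runs steps 1-2 and also sees 2
    const unsigned long blueResult = decrementAndStore(blue, sharedCount);
    const unsigned long redResult = decrementAndStore(red, sharedCount);  // stale 2

    assert(sharedCount != 0);  // the count never reaches zero, so nothing is deleted
    return Outcome{redResult, blueResult, sharedCount};
}
```

Calling replayLostDecrement() yields 1 for both return values and leaves the shared count at 1: two releases happened, but only one decrement survived.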

 

How do we solve this problem?  Basically there are two ways to do it:

 

1)      Do the necessary work to ensure thread safety

2)      Require your callers to behave in a manner such that you never get into this situation in the first place.

 

The operating system provides tools to make multi-threaded programming work.  There are functions like InterlockedIncrement, which really do "atomically" bump up a counter. Signals and semaphores and critical sections and all the other tools you need to make multi-threaded programs are available. I'm not going to talk much about those.
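
As a rough illustration of the first option, here is a sketch of a reference count whose decrement cannot be split by a context switch. It uses std::atomic from standard C++ as a portable stand-in for the Interlocked functions; that substitution is an assumption of this example, not something the original code does. The class name RefCountedSketch and the destroyedCounter hook (there only so the deletion can be observed) are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical ref-counted object; it starts life with one reference.
class RefCountedSketch {
public:
    explicit RefCountedSketch(std::atomic<int>* destroyedCounter)
        : m_destroyedCounter(destroyedCounter) {}

    unsigned long AddRef() {
        // fetch_add returns the value *before* the increment.
        return m_cRef.fetch_add(1) + 1;
    }

    unsigned long Release() {
        // The read, the decrement and the write happen as one indivisible
        // step, so two racing callers can never both observe the same value.
        const unsigned long previous = m_cRef.fetch_sub(1);
        assert(previous > 0 && "Release called more often than AddRef");
        if (previous == 1) {
            delete this;  // exactly one caller sees the count reach zero
            return 0;
        }
        // Return a value computed from our own snapshot rather than
        // re-reading m_cRef, which another thread may already have changed.
        return previous - 1;
    }

private:
    // Private destructor: the object can only be destroyed through Release().
    ~RefCountedSketch() {
        if (m_destroyedCounter != nullptr) {
            m_destroyedCounter->fetch_add(1);
        }
    }

    std::atomic<unsigned long> m_cRef{1};
    std::atomic<int>* m_destroyedCounter;  // test hook, not part of the pattern
};
```

The important design point is that Release() decides what to do from the value returned by the atomic operation itself. Decrementing atomically and then reading the counter again in a separate step would reintroduce a window for another thread to slip in.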

Writing a truly free-threaded program is a lot of work.  There are a lot of ways to get it wrong, and there are potential performance pitfalls as well.  Fortunately, there is a middle ground between "only one thread allowed" and "any thread can call any method at any time". 

The idea of the COM threading models is to provide a contract between callers and callees so that, as long as both sides follow the contract, situations like the one above never come to pass.

 

Suppose a caller has several instances of an object (the callee), and the caller has several threads going.  The commonly used standard threading contracts are as follows:

 

* Single threaded -- all calls to all instances of the object must always be on the same thread.  There are no synchronization issues because there is always only one thread no matter how many object instances there are.  The caller is responsible for ensuring that all calls to all instances are on the same thread.

 

* Free threaded -- calls to the object can be on any thread at any time, including multiple threads at the same time.  The object is responsible for all synchronization issues.

 

* Apartment threaded -- all calls to any given instance of the object must always be on the same thread, but different instances can be called on different threads at the same time.  The caller is responsible for ensuring that given an instance, all calls to that instance happen on the same thread.  The object is responsible for synchronizing access to global (that is, not-per-instance) data that it owns. 

An analogy might help. Think of an apartment building where each apartment is a thread and each person is an object instance.  You can put as many people into one apartment as you want, and you can put people into lots of different apartments, but once you've done so, you always have to go to a person's apartment if you want to talk to them.  Why? Because they never move out once they're in an apartment; you have to wait for them to die before they ever leave.  (Insert New Yorker joke here.) 

Furthermore, you can't talk "through the walls" from one apartment to someone in another apartment.  (Well, actually you can -- that's called "marshaling", and that's a subject for a future post.)  And finally, if the people jointly own a shared resource -- say, a rooftop barbecue, to stretch this silly analogy to its limit -- then they must sort out amongst themselves how to synchronize access to the shared resource.

 

* Rental threaded -- calls to an object can be on any thread but the caller guarantees that only one thread is calling into the object at any time.  Rental threading requires a different analogy: suppose the object instances are rented televisions and, again, threads are apartments.  A television can be moved from apartment to apartment but can never be in more than one apartment at the same time. Multiple televisions can be in the same apartment, and multiple apartments can have multiple televisions.  But if you want to watch a television, you have to go to the apartment where it is.
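
The apartment threaded contract in particular can be sketched in a few lines. The following is an illustration only, assuming a hypothetical ApartmentCounter class: the per-instance count is left unsynchronized because the contract promises a single calling thread, while the shared live-instance total (the "rooftop barbecue") is atomic because instances on different threads may touch it at the same time. Real COM objects typically do not police the caller like this; the thread check is added here just to make a contract violation visible.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical apartment threaded object.
class ApartmentCounter {
public:
    ApartmentCounter() : m_owner(std::this_thread::get_id()) {
        s_liveInstances.fetch_add(1);  // shared data: must be synchronized
    }

    ~ApartmentCounter() {
        // Destroying the instance on another thread also breaks the contract.
        assert(std::this_thread::get_id() == m_owner);
        s_liveInstances.fetch_sub(1);
    }

    // Returns false if the caller is not on the thread that created us.
    bool Increment() {
        if (std::this_thread::get_id() != m_owner) {
            return false;
        }
        ++m_count;  // per-instance data: no lock needed under the contract
        return true;
    }

    int Count() const { return m_count; }

    static int LiveInstances() { return s_liveInstances.load(); }

private:
    const std::thread::id m_owner;
    int m_count = 0;
    static std::atomic<int> s_liveInstances;
};

std::atomic<int> ApartmentCounter::s_liveInstances{0};

// Calls Increment() from a freshly created thread and reports the result.
inline bool incrementFromOtherThread(ApartmentCounter& counter) {
    bool accepted = true;
    std::thread other([&] { accepted = counter.Increment(); });
    other.join();
    return accepted;
}
```

An instance created on one thread accepts Increment() there, while incrementFromOtherThread() on the same instance is refused and leaves the count untouched.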

Whew, that was a long preamble.  How does this pertain to the script engines?

 

Most COM objects -- almost all ActiveX objects, and all of the object models commonly used by script -- are apartment threaded objects.  They expect that multiple instances of the object can be created on multiple threads, but once an instance is created on a thread, it will always be called on that thread.  This gives us the best of both worlds -- the caller can be free threaded and can create multiple objects on multiple threads, but the callee does not have to synchronize access to any per-instance data.

 

But the script engines are free threaded objects, and they call into those apartment threaded objects.  The script engines must therefore ensure that they do not violate the apartment model contract of the objects they use.

 

So guess what?  The script engines actually have a bizarre, custom contract that is a little more restrictive than free threading and less restrictive than apartment threading! 

 

The script engine contract is as follows:

 

* When the script engine is in a state where it cannot possibly call an ActiveX object -- for instance, if it has just been created and has not started running code, or if it is just about to be shut down -- then the script engine really is free threaded, but who cares?  It can't do much in this state.

 

* When the script engine is initialized -- when the script engine host has started the process of passing code and object model state to the engine -- the script engine morphs into an apartment threaded object.  All calls to the script engine must be on the initializing thread until the script engine is shut down again.

 

* There are two exceptions to the previous rule -- the InterruptScriptThread and Clone methods can always be called from any thread.  The former is the mechanism whereby the host can signal a long-running script that it is taking too long and needs to shut down.  Obviously that has to be on a different thread than the one that is taking too long! The latter is how ASP starts a second request for a currently-executing page to run on a second thread without recompiling the script. 

 

* If the script host (the browser, for instance) violates this contract, the script engine will usually return the E_UNEXPECTED error, which has the helpful error message "CATASTROPHIC FAILURE".  (Not that we're trying to scare you or anything; it's just our way of saying "please don't do that".)
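
The rules above amount to a small state machine, which can be sketched as follows. This is an assumption-laden illustration, not the engines' actual implementation: ScriptEngineSketch, Initialize, OrdinaryCall, Close and callOnOtherThread are hypothetical names, OrdinaryCall stands in for any method other than the exempt ones, and Clone is left out for brevity. Only the E_UNEXPECTED value (0x8000FFFF) and the always-callable InterruptScriptThread come from the description above.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <mutex>
#include <thread>

using HResult = std::int32_t;
constexpr HResult kOk = 0;                                         // S_OK
constexpr HResult kUnexpected = static_cast<HResult>(0x8000FFFF);  // E_UNEXPECTED

class ScriptEngineSketch {
public:
    // The host starts passing code and state: bind to the calling thread.
    HResult Initialize() {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_initialized) return kUnexpected;  // assumption: no double init
        m_initialized = true;
        m_owner = std::this_thread::get_id();
        return kOk;
    }

    // Stand-in for any ordinary engine method.
    HResult OrdinaryCall() {
        std::lock_guard<std::mutex> lock(m_mutex);
        return callerIsAllowed() ? kOk : kUnexpected;
    }

    // Shut down; afterwards the engine is free threaded again.
    HResult Close() {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (!callerIsAllowed()) return kUnexpected;
        m_initialized = false;
        return kOk;
    }

    // Exempt from the thread check: callable from any thread at any time.
    HResult InterruptScriptThread() {
        m_interruptRequested.store(true);
        return kOk;
    }

    bool InterruptRequested() const { return m_interruptRequested.load(); }

private:
    // Free threaded while uninitialized, apartment threaded afterwards.
    bool callerIsAllowed() const {
        return !m_initialized || std::this_thread::get_id() == m_owner;
    }

    std::mutex m_mutex;
    bool m_initialized = false;
    std::thread::id m_owner;
    std::atomic<bool> m_interruptRequested{false};
};

// Runs a call on a freshly created thread and returns its result.
template <typename F>
HResult callOnOtherThread(F call) {
    HResult result = kOk;
    std::thread other([&] { result = call(); });
    other.join();
    return result;
}
```

With this sketch, a call from a second thread succeeds before Initialize(), fails with kUnexpected while the engine is initialized, and succeeds again after Close(), while InterruptScriptThread() succeeds from any thread throughout.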

 

Anyway, that was a whole lot of not particularly germane information unless you are developing your own script host.  But like I said, I have a few ideas for some future topics that will assume some background understanding of COM threading models.

 

  • Great article! I was just about to suggest to you to talk about threading issues in VBScript and JScript and you beat me to it. The more I've been pushing JScript inside of IE, the more threading issues come into play, so the more light you can shed on it, the better. Granted IE's handling of HTML parsing tends to be the biggest pain, but I can generally find workarounds. There are times though that I would have loved to have had something run on a "background thread", but have yet to figure out how to do that.
  • Thanks, I'm glad you enjoyed it. I was planning on taking this thread (no pun intended) towards ASP, where there are very serious threading considerations. I'll see if I can come up with anything interesting from the IE perspective. It's very difficult to do true multi-threading inside IE. However, there are clever things you can do with the setTimeout method. I've also been thinking that it might be interesting to describe setTimeout from a continuation-passing-style perspective.
  • "I've also been thinking that it might be interesting to describe setTimeout from a continuation-passing-style perspective." That would be fantastic.
  • Not exactly setTimeout, but asynchronous HTTP/SOAP really gets some benefit from closures: http://www.cabezal.com/blog/archives/000607.shtml
  • It's always struck me as a little bit odd that Active Server Pages, a web server, encourages developers to use VBScript and JScript to write server-side scripts. I mean, the whole point of a web server is that it produces complex strings as blindingly fast as possible, on demand.
  • Eric, it would be great if you could expand on how to approach multithreading in IE. Also, some insights on how setTimeout & setInterval really work would be very welcome :-)

    Thanks for your great posts!
  • Attempting true multithreading in IE is a bad idea.

    Unfortunately, as I said in my post "Confessions of a Language Designer", I really know very little about the implementation of the IE object model. I've never so much as looked at the OM source code, and I certainly haven't written very many "real world" web pages.
  • You know very little about IE OM but good enough to say that multithreading is a bad idea... I'm guessing that little knowledge would be great to share :-)

    Thanks again!
  • This post comes very close (I think) to addressing an issue I've been having when cloning a script engine across threads, but I was wondering if you could expand on it a bit.

    Scenario: Thread A creates a script engine and loads it with code (JScript in this case), but does not execute it. Later, Thread B clones the script engine created by Thread A, runs it, then destroys its local copy (the clone). All is fine up to that point. Then later, Thread A tries to destroy the original copy of the script engine, at which point, I get the CATASTROPHIC ERROR you mentioned.

    What am I doing wrong?
  • That sounds right to me. What function are you calling when you get the error?

    Can you send me the simplest possible program that reproduces the error?
  • Hi Eric. I am REALLY interested to find a way to implement true multi-threading within IE. I've had to create a custom garbage collector for my AJAX application to ensure when HTML is refreshed within an element, all event handlers and closures referencing those elements get cleaned up, thus preventing memory leaks in long-running instances of the same document.

    I know all about setTimeout (and Interval), and I use that to kick off the cleanup process, but with 100+ objects in the cleanup queue, the UI of the application still blocks (which is expected since setTimeout is not truly a threading mechanism).

    You said "It's very difficult to do true multi-threading inside IE" That implies that it is possible. Please please please explain! I have googled this topic all day. The closest thing I've found is the information that Sun's new Java 6.0 ships with a Rhino-based JavaScript engine, which sounds great except that I'm targeting IE. Please help =)
  • I have been developing some multi-window browser-based client software, making extensive use of JavaScript.  In order to maintain appropriate state, I keep a list of open windows in a JavaScript object variable located in the top level of the primary window.  This window also contains a library of JavaScript functions to perform various tasks, including the opening of a new window (and update of the object) or the removal of a window from the object.  In addition, there is a function which processes periodic information from the web server, routing the information to the appropriate "child" window.

    When a "child" window closes, it calls a function on its opener (the primary window) to remove itself from the window-list object.

    When this happens while a message is being routed to that window, bizarre JavaScript errors occur.

    Essentially, the onunload of the "child" window calls "opener.CloseWindow":

    function CloseWindow(id)
    {
     windowlist[id] = null;  // could use 'delete' as well, but want lowest common denominator
    }

    Meanwhile, a frame on the primary window is calling "top.AddMessage":

    function AddMessage(id,msg)
    {
     var w = windowlist[id];
     if (w != null)
     {
       // make several references to w to do stuff with the info
       w.this = ...
       w.that = ...
       w.document.something = ...
     }
    }

    I am getting an error inside of the AddMessage function when the "child" window closes, usually one of the following:
    Unspecified error.
    Access is denied.
    Permission denied.
    The callee (server [not server application]) is not available and disappeared; all connections are invalid.  The call did not execute.

    Sometimes IE actually crashes:
    The instruction at "0xXXXXXXXX" referenced memory at "0xXXXXXXXX". The memory could not be "read".
    where both 0xXXXXXXXX are usually the same (indicating an attempt to execute an invalid memory location).

    When I attach the Visual Studio 6.0 SP4 debugger to IE after a trappable error has occurred, I find that the point of execution is invariably inside the "if" statement inside of the AddMessage function, and that the value of "w" is null!

    Rather than panic, since I am aware that the JavaScript interpreter provided by IE (or the Windows Scripting Host) may be multithreaded, I looked for ways to solve the problem.

    Initially I wrote a spin-lock function which provides mutual exclusion between the AddMessage and CloseWindow functions.  What I found this time was even MORE bizarre.  I attached the debugger after IE complained of a script taking too much time and this is what I saw in the stack:

    SpinLock(1,true)                              // Lock for CloseWindow
    CloseWindow
    AddMessage                                   // Inside the if
    ...

    Since AddMessage had already acquired the lock, the attempt by CloseWindow to acquire the lock ON THE SAME THREAD caused a deadlock, since the functions are mutually exclusive.  Of course, I didn't expect any thread to try to acquire both since AddMessage doesn't CALL CloseWindow, or vice versa.  I would expect the calls to occur on separate threads with separate stacks.  There were multiple threads in the debugger, but the other two threads both had no stack information.  It looked like the JavaScript interpreter had literally pushed the call to CloseWindow onto stack for the thread running AddMessage, essentially interrupting the call to AddMessage at some arbitrary point AS IF AddMessage had made the call.  This seems like a broken method of implementing multi-threaded behavior.

    I tried returning and using results from AddMessage and CloseWindow, but this still evinced the same behavior.  I guess the return value for CloseWindow got whisked off to the appropriate location.

    I also tried using the window.onerror handler to avoid any error messages in AddMessage, since, if the window is closing, we don't care about dispatching the message.  Again, I got interesting results.  Errors were skipped as long as the window referred to in the window-list was open (I deliberately introduced an error into the code to test this) but the minute I closed the "child" window during a call to AddMessage, I got an error.  In fact, the error I got was for the erroneous line I introduced into the code.  It was as if the onerror handler I was setting was for the WRONG window, but only when the "child" window was closing.  What baffles me is that AddMessage is not even called by any of the "child" windows; it is only called by a frame on the primary window.

    Any insight into IE / JScript's threading would be great!

    -Tom


    Thomas S. Trias
    Senior Developer
    Artizan Internet Services
    http://www.artizan.com/
  • Someone asked me if the "asynchronous" in AJAX means we should be worried about thread safety and global...
