Dino's Blog

.NET Stuff

  • World's Worst Paint Program

    I've uploaded the World's Worst Paint Program, as seen in my PDC talk, to IronPython's CodePlex site:

     http://ironpython.codeplex.com/Release/ProjectReleases.aspx?ReleaseId=28125#DownloadId=93331

     This is built using Visual Studio 2010 Beta 2 and the IronPython CTP for that release - so you'll need to download both of those.  I've also included the Python and Ruby code that I used during the demonstration in a file called code.txt.  The paint program itself is in its final form from the end of the demo.

  • Two IronPython releases in 1 week

    This was a good week for IronPython - we've released not one but two new versions!

    The first release was IronPython 2.0.2 which just includes bug fixes for issues which have been particularly irritating to our users.  This is a very minor release and doesn't include breaking changes.  Probably the most significant fix is that we've fixed ngen on 64-bit platforms.  Previously when you would install 2.0 and enable ngen it would only ngen the binaries for 32-bit assemblies.  Now we'll ngen for 64-bit which should help users running on 64-bit see much faster startup times.  I hope that this will make 2.0 much more usable until we get the really big improvements that 2.6 brings.

    And speaking of 2.6 the other release this week was our 2nd and final beta.  2.6 is shaping up to be an exciting release.  We've significantly improved the startup time of IronPython - this has been our number one requested change.  We've implemented more standard modules and other missing functionality to be closer to CPython.  This includes the new ctypes module, support for sys.settrace and pdb, as well as other new functionality that 2.6 has introduced such as the underlying support for the io module.  And finally we've added a bunch of new features to improve .NET interop as well - we've now got support for generic method type inference and added the core support that will enable using .NET attributes and in general working better w/ .NET when static types are expected.  I'm pretty excited about this release and can't wait to get the final version out - so please try out this beta and let us know if you run into any issues.

  • On Performance

    Ahh, performance…  It’s so much fun!  It’s easy to measure and as you work on it you can see immediate benefits (or not!).  I personally enjoy getting to spend time looking at interesting performance issues and I’ll use this as a time to plug IronPython 2.6 Beta 1 where we’ve continued our focus in this area.  Even more fun yet is when you compare and contrast different implementations - it can be both surprising and amusing.  Recently Robert Smallshire has been doing just that: http://www.smallshire.org.uk/sufficientlysmall/2009/05/22/ironpython-2-0-and-jython-2-5-performance-compared-to-python-2-5/ Obviously Robert has found a case in IronPython where something is performing horribly.  I’d like to take a look at what’s happening here and what we can do to improve this case. 

    But before I look at what’s going on here I’d like to briefly say we love to hear feedback from users of IronPython.  We have a mailing list you can join to give us your feedback, and bugs can be reported on our website.  We track both of these very closely and respond as quickly as we can – which is usually pretty fast.  I just mention that because it’s easier for us to improve IronPython when people let us know about a problem vs. us having to scour the web to find issues.  Hopefully this piqued interest will have more people trying out IronPython and giving us feedback.

    Ok, but let’s get onto the fun technical details!  It all comes down to one line of code “Node.counter += 1”.  That’s a pretty innocent looking piece of code, how can it cause so many problems?  First you need to understand a little bit about how IronPython works.  IronPython internally keeps versions of type objects and uses this information to generate optimal code paths for accessing those objects.  So the next line of code in this example, “self._children = children” usually turns into something that’s like:

    if (self != null && self.GetType() == typeof(PyObject) && self.PythonType.Version == 42) {

         self._dict["_children"] = <new value>;

    }

    What that one line of code causes then is for the version of the type to change and for this “rule” to be invalidated.  So we’ll figure out what to do, generate a new cached rule, only to have it not work the next time around.  The same thing applies to the calls to Node – we re-generate the “optimal” code again and again only to discover we have to throw it away.
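    To make the invalidation loop concrete, here is a toy model in plain Python.  It is only a sketch of the idea described above, not IronPython's actual implementation: VersionedType, SetAttrSite and run are made-up names, and "generating a rule" is reduced to bumping a counter.

```python
class VersionedType:
    """Toy stand-in for a type object that bumps a version on every mutation."""
    def __init__(self):
        self.version = 0
        self.attrs = {}

    def set_attr(self, name, value):
        self.attrs[name] = value
        self.version += 1  # any change to the type invalidates cached rules


class SetAttrSite:
    """Toy call site that caches one rule guarded by the type version."""
    def __init__(self):
        self.cached_version = None
        self.rules_generated = 0

    def set_member(self, type_obj, instance_dict, name, value):
        if self.cached_version != type_obj.version:
            # Slow path: redo member resolution and "compile" a new rule.
            self.rules_generated += 1
            self.cached_version = type_obj.version
        # Fast path: the guard matched, so store straight into the instance dict.
        instance_dict[name] = value


def run(iterations, mutate_type):
    """Count how many rules get generated over a number of attribute stores."""
    node_type = VersionedType()
    site = SetAttrSite()
    for i in range(iterations):
        if mutate_type:
            node_type.set_attr("counter", i)  # plays the role of Node.counter += 1
        site.set_member(node_type, {}, "_children", [])
    return site.rules_generated
```

    In this model a type that is mutated on every iteration forces a new rule every time, while a type that is left alone needs exactly one rule no matter how many stores happen.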

    So that sounds pretty bad but if you’re familiar with Python’s member access rules you’ll quickly recognize that caching the rule is normally much faster than needing to do the resolution again and again.  That involves looking through the class hierarchy to see if we have any data descriptors or classes defining __setattr__ and it's nice to avoid all of that work.   So to get the best performance we need the world to settle down and not be continuously changing.  We’re well aware of this problem with our implementation and we have continued to focus on improving it.  IronPython 2.0 already includes one mitigation so that we don’t need to continuously generate new code – instead we effectively patch new values into an existing piece of code.  In our IronPython 2.6 branch we’ve continued to improve performance here and significantly reduced the expense when things get mutated.  But it turns out that there’s an easy fix for this particular problem; we’ve just never thought it that important of a problem to solve.

    One of the amusing things about this is we’ve actually seen it before – Resolver Systems was having some performance issues with Resolver One, their advanced spreadsheet written in IronPython.  One of those issues turned out to be a type which they both used frequently and mutated frequently.  This actually doesn’t seem to be a common idiom in Python code which is the reason we’ve never optimized for it before.  In this case though it turned out that they had found an optimization that improved performance on IronPython 1.0 but was destroying perf on IronPython 2.0!  For Resolver fixing this was a simple change to no longer mutate types.  Even if you really want to use a type as a property bag you can always move that to a type which you aren’t creating instances of and executing methods on.

    With all that in mind let’s look at some numbers.  Here I’ve compared IronPython 2.0.1, IronPython 2.6 Beta 1, CPython 2.6, and IronPython 2.6 with a fix for this issue.

    [Chart: Perf Comparison]

     

    Like Robert’s charts I have tree size in number of nodes across the bottom and execution time in seconds across the top and the execution times are logarithmic.  We can see the abysmal performance of 2.0.1, we can see that 2.6B1 has already made significant improvements, and with IronPython 2.6B1 plus a fix for this specific issue we actually beat CPython as the number of iterations scales up.

    But there’s another way we can look at this – instead of fixing IronPython let’s look what happens if we fix the benchmark to avoid this performance pitfall.   To do this I’ve replaced the mutation of the class with a global variable.  When running against all of the same implementations again I get these results:

    [Chart: perf 2]

     

    Here you can see that all 3 versions of IronPython are performing about the same and are coming in significantly faster than CPython as the number of iterations goes up.  These are now also matching the numbers Robert is getting when he has made the same change.

    So you might be curious - what’s the fix?  Well in this case we’re just mutating the type by changing a simple value stored in the type (versus say adding a property or a method to the type).  Today we get a very slight throughput performance benefit because we can burn the value directly into the generated code instead of fetching it from the wrapper the type object keeps it in.  With the fix we’ll burn the wrapper in and fetch the value from the wrapper each time.  Then when we go and modify the type we just modify the wrapper instead of the type and its version is unchanged!  This fix is now checked into the IronPython source tree and you can download the fixed sources right now. 
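    Here is the same idea as a toy Python model.  Treat it as an illustration under my own assumptions rather than the real code: Cell, VersionedType and make_getter are invented names, and a closure stands in for generated code.

```python
class Cell:
    """Wrapper that holds a class-level value; the type keeps the cell, not the value."""
    def __init__(self, value):
        self.value = value


class VersionedType:
    def __init__(self):
        self.version = 0
        self.cells = {}

    def set_attr(self, name, value):
        cell = self.cells.get(name)
        if cell is not None:
            cell.value = value      # plain value update: version is unchanged
        else:
            self.cells[name] = Cell(value)
            self.version += 1       # adding a new member still changes the type


def make_getter(type_obj, name):
    """'Generate code' for reading type_obj.name with the cell burned in."""
    cell = type_obj.cells[name]
    version = type_obj.version

    def getter():
        if type_obj.version != version:
            raise LookupError("rule invalidated, regenerate")
        return cell.value           # one extra indirection per read
    return getter
```

    The getter keeps working across any number of value updates because only the cell's contents change; it is invalidated only when a new member is added to the type.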

    In conclusion I’d like to wrap up a little bit about our performance philosophy – performance is hard and deciding which numbers matter to you is important.  We have tried to tune IronPython for apps which run for more than a fraction of a second.  We have also tuned IronPython for code which is not constantly mutating types at runtime.  And we’re always open to feedback when users encounter problems with IronPython and we’ll listen to what scenarios are important for them.  But if you look at these graphs I think you can see that we’ve done a good job at making IronPython a fast implementation of Python.

     

  • A Simple DLR Binder

    A lot of people find it a little confusing to write their first binder to consume dynamic objects running on the DLR.  It’s not actually super simple but it’s not too hard either.  To shine some light on this I thought I’d post a very simple binder that shows how you plug in.  This example just supports adding 2 ints.   If the target object is an IDynamicObject the object will get the first crack at implementing the operation.  So for example a class written in IronPython which implements __iadd__ will call the __iadd__ method to do the addition (ok, that’ll be true once I support the current DLR way of doing operations, we’re still using an older version).  For objects of any other types which don’t implement IDynamicObject it will report an error indicating the types and the operation kind.  Even if you specify a different operation kind it will still just add ints. 

     

    There are a few key pieces you need to understand here.  First you’re returning an expression tree back to the DLR to tell the DLR what to do.  The DLR will compile this expression tree and run it.  You’re also returning a set of “restrictions”.  These will be evaluated by the DLR to see if it can re-use the same code for other objects in the future.  You’re unlimited in what you can do with restrictions as they can be arbitrary Expression trees.  But here I’m simply restricting based upon the arguments’ .NET types which is one of the most common restrictions.  Both the expression tree and restrictions get packaged up in a DynamicMetaObject which is conveniently the same type of object you receive as your arguments. 

     

    Finally there’s the hash identity which is how the DLR knows if it can share rules or not amongst different binders.  The base DLR binders override GetHashCode/Equals to hash on the operation, member name, etc… depending on the binder type so I just return this here.
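    If the relationship between binders, rules and restrictions is still fuzzy, this small Python model may help.  It is a simplification under my own assumptions and not the DLR API: Rule, CallSite and add_binder are hypothetical names, a plain function stands in for the compiled expression tree, and a predicate stands in for the restrictions.

```python
class Rule:
    def __init__(self, restriction, action):
        self.restriction = restriction  # predicate over the arguments
        self.action = action            # what to run when the predicate holds


class CallSite:
    def __init__(self, binder):
        self.binder = binder
        self.rules = []
        self.bind_count = 0

    def invoke(self, target, arg):
        for rule in self.rules:
            if rule.restriction(target, arg):
                return rule.action(target, arg)
        # Cache miss: ask the binder for a new rule and remember it.
        self.bind_count += 1
        rule = self.binder(target, arg)
        self.rules.append(rule)
        return rule.action(target, arg)


def add_binder(target, arg):
    t_type, a_type = type(target), type(arg)

    def restriction(t, a):
        # Restrict on the exact runtime types that were seen.
        return type(t) is t_type and type(a) is a_type

    if t_type is int and a_type is int:
        return Rule(restriction, lambda t, a: t + a)

    def fail(t, a):
        raise TypeError("Can't perform operation Add on %s and %s"
                        % (t_type.__name__, a_type.__name__))
    return Rule(restriction, fail)
```

    The binder only runs on a cache miss; once a rule exists for a given pair of argument types, later calls with the same types reuse it without binding again.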

     

    So here it is: a simple binder.  This targets the latest and greatest DLR sources, which have experienced some renames, and so isn’t compatible with IronPython 2.0.  But you can always get the latest IronPython source code which this will compile against at http://www.codeplex.com/IronPython/SourceControl/ListDownloadableCommits.aspx

     

        class MyOperationBinder : BinaryOperationBinder {
            public override DynamicMetaObject FallbackBinaryOperation(DynamicMetaObject target, DynamicMetaObject arg, DynamicMetaObject errorSuggestion) {
                if (target.LimitType == typeof(int) && arg.LimitType == typeof(int)) {
                    // Both operands are ints: emit an int addition that is reusable whenever both are ints.
                    return new DynamicMetaObject(
                        Expression.Add(
                            Expression.Convert(target.Expression, typeof(int)),
                            Expression.Convert(arg.Expression, typeof(int))
                        ),
                        BindingRestrictions.GetTypeRestriction(target.Expression, typeof(int)).Merge(BindingRestrictions.GetTypeRestriction(arg.Expression, typeof(int)))
                    );
                }

                // Anything else: emit a throw.  Restrict on the types we actually saw so the
                // error rule applies to these arguments and is only reused for the same pair of types.
                return new DynamicMetaObject(
                    Expression.Throw(
                        Expression.New(
                            typeof(Exception).GetConstructor(new Type[] { typeof(string) }),
                            Expression.Constant(
                                String.Format("Can't perform operation {0} on {1} and {2}", this.Operation, target.LimitType, arg.LimitType)
                            )
                        )
                    ),
                    BindingRestrictions.GetTypeRestriction(target.Expression, target.LimitType).Merge(BindingRestrictions.GetTypeRestriction(arg.Expression, arg.LimitType))
                );
            }

            public override object CacheIdentity {
                get { return this; }
            }
        }

     

  • IronPython, MS SQL, and PEP 249

    Over the past week and a half I've spent a little bit of time getting Django 0.96.1 running on IronPython.  The 1st step in doing this was getting a database provider that would run on .NET that would work with Django.  For DB backends Django basically follows PEP 249 with a few extensions.  Here's the basic DB provider I came up with.  It's not quite complete but it was good enough for me to get Django's tutorial running.  To use this you just need to copy one of the existing DB backends and replace the code in base.py with the code below.  For the rest of the code you can pull it from ado_mssql. 

     Basically this just uses the .NET System.Data namespace to do the communication with the database server.

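    import clr
    clr.AddReference('System.Data')
    from System import Data
    from System.Data import SqlClient
    import System

    import datetime  # used by SqlCursor._fix_one_record
    from threading import local

    # CursorDebugWrapper comes from here; ado_mssql uses the same import
    from django.db.backends import util

    DatabaseError = Data.DataException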

    class SqlCursor(object):
        arraysize = 1
       
        def __init__(self, connection):
            self.connection = connection
            self.transaction = None
            self.reader = None
            self.record_enum = None
       
        def execute(self, sql, params=()):       
            parameters = []

            # translate to named parameters
            if type(params) in (list, tuple):
                # indexed params, replace any %s w/ @GeneratedName#
                if sql.find('%s') != -1:
                    cmd = ''
                    sqlSplit = sql.split('%s')
                    for text, value, index in zip(sqlSplit, params, range(len(params))):
                        cmd += text + '@GeneratedName' + str(index)
                        parameters.append(SqlClient.SqlParameter('@GeneratedName' + str(index), str(value)))

                    sql = cmd + sqlSplit[-1]
            else:
                for name, value in params.iteritems():
                    sql = sql.replace('%(' + name +')s', '@' + name)
                    parameters.append(SqlClient.SqlParameter('@' + name, str(value)))
           
            command = SqlClient.SqlCommand(sql, self.connection)
           
            for param in parameters: command.Parameters.Add(param)
           
            self.record_enum = None
            self.reader = command.ExecuteReader()
       
        def close(self):
            # nothing to close if execute() was never called
            if self.reader is not None:
                self.reader.Close()
       
        def executemany(self, sql, param_list):
            # PEP 249: run the same statement once per parameter set
            for params in param_list:
                self.execute(sql, params)
           
        def fetchall(self):
            return [self._make_record(record) for record in self.reader]           
       
        def fetchone(self):
            if self.record_enum is None:
                self.record_enum = iter(self.reader)
            if self.record_enum.MoveNext():
                return self._make_record(self.record_enum.Current)
            return None
           
        def fetchmany(self, size=None):
            if size is None: size = SqlCursor.arraysize
            res = []
            for i in range(size):
                x = self.fetchone()
               
                if x is None: break
               
                res.append(x)
            return res

        def _make_record(self, record):
            return tuple((self._fix_one_record(record[i]) for i in xrange(record.FieldCount)))

        def _fix_one_record(self, record):
            if type(record) is System.DateTime:
                return datetime.datetime(record)
           
            return record
       
        @property
        def rowcount(self):
            if self.reader is not None:
                return self.reader.RecordsAffected
            return -1
       
    class DatabaseWrapper(local):
        def __init__(self, **kwargs):
            self.connection = None
            self.queries = []
            self.transaction = None

        def cursor(self):
            from django.conf import settings
            if self.connection is None:
                if not settings.DATABASE_HOST:
                    settings.DATABASE_HOST = "(local)"
                if settings.DATABASE_NAME == '' or settings.DATABASE_USER == '':
                    conn_string = "Data Source=%s;Initial Catalog=%s;Integrated Security=SSPI;MultipleActiveResultSets=True" % (settings.DATABASE_HOST, settings.DATABASE_NAME)
                else:
                    # explicit SQL login: leave out Integrated Security so the user ID and password are used
                    conn_string = "Data Source=%s;Initial Catalog=%s;User ID=%s;Password=%s;MultipleActiveResultSets=True" % (settings.DATABASE_HOST, settings.DATABASE_NAME, settings.DATABASE_USER, settings.DATABASE_PASSWORD)
               
                self.connection = SqlClient.SqlConnection(conn_string)
                self.connection.Open()
               
            cursor = SqlCursor(self.connection)
            if settings.DEBUG:
                return util.CursorDebugWrapper(cursor, self)
            return cursor

        def _commit(self):
            if self.transaction is not None:
                return self.transaction.Commit()

        def _rollback(self):
            if self.transaction is not None:
                return self.transaction.Rollback()

        def close(self):
            if self.connection is not None:
                # actually close the underlying SqlConnection before dropping it
                self.connection.Close()
                self.connection = None
                self.transaction = None
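
    The placeholder rewriting in execute is the part most worth checking in isolation.  Here is a standalone sketch of just that step in plain Python so it can be tried without a database.  translate_positional and translate_named are hypothetical helper names, not part of the backend above, and they return (name, value) pairs instead of building SqlParameter objects.

```python
def translate_positional(sql, params):
    """Rewrite '%s' placeholders as @GeneratedName0, @GeneratedName1, ..."""
    pieces = sql.split('%s')
    if len(pieces) - 1 != len(params):
        raise ValueError("expected %d parameters, got %d"
                         % (len(pieces) - 1, len(params)))
    out = []
    named = []
    for index, (text, value) in enumerate(zip(pieces, params)):
        name = '@GeneratedName' + str(index)
        out.append(text + name)
        named.append((name, value))
    out.append(pieces[-1])  # trailing text after the last placeholder
    return ''.join(out), named


def translate_named(sql, params):
    """Rewrite '%(name)s' placeholders as @name."""
    named = []
    for name, value in sorted(params.items()):
        sql = sql.replace('%(' + name + ')s', '@' + name)
        named.append(('@' + name, value))
    return sql, named
```

    Unlike the cursor code, this sketch raises when the number of %s placeholders and parameters disagree instead of silently truncating; that is a choice made for the example, not something the backend does.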

  • New opportunities...

    After a little over 4 years it’s time for me to go and explore new opportunities… It's awesome to see Whidbey out there now and I'm sure the work we did to improve the reliability of the CLR and embed it in SQL Server will pay off for our customers.  But it's time for me to go and try something new - so I'm switching from the CLR team to... join the CLR team :).

    I'll be switching from my current test role over to development to work on IronPython and improving the CLR's support for scripting languages in general.  I started on the new team last Monday and it's already been a lot of fun.  Today we've shipped our first release since I joined (0.9.5 - which you can download here).  While this is the first release since I've joined the team it's the second release to which I've contributed.  This release brings bug fixes, support for several new built-in types and improvements when using Python objects from WPF. 

    So expect to see less about reliability and hosting (when I post :) ) from me over time and more about IronPython and scripting.  I still have some more I want to say about reliability and hosting so there should be an interesting mix going forward.

    Anyway, I encourage everyone to check out our latest release and let us know what you think!

     

  • Fiber mode is gone...

    Well, now that our RTM and RC-esque builds are starting to make it out into the wild it seems like a good time to discuss a feature we had to make go away...

    That feature which most readers of my blog are probably now familiar with is fiber mode.  So why did we cut fiber mode for the released version of the product?  And what changes will you see because of this?

    We had gone through great efforts to make sure that all of our core functionality worked great inside of a fiber mode host.  Not only did the CLR team do a lot of work to make sure fiber mode was solid, but so did many other teams who picked up our toolset and verified that it worked properly by running their tests inside of fiber mode.  Not to mention the teams that typically run their tests inside of Yukon who also make sure their features work in fiber mode.  We even spent extra time at the end of beta 2 to make sure we got all the last minute fiber bugs fixed from our full fiber mode test pass - but ultimately it wasn't enough.

    In the end we had one final exit criterion we'd have to pass in order to make sure fiber mode was solid:  Stress.  Between the choices of spending our time fixing the bugs that we knew ALL of our customers would hit in thread mode, or fixing the bugs that a small percentage of users would hit in fiber mode, we picked fixing thread mode bugs.  In the end I think most customers will be happy as the SQL and CLR teams came together and we've spent a huge amount of time, resources and effort ensuring that SQL/CLR works great under thread mode.

    For those of you wanting to develop a fiber mode CLR solution I think you need to first ask yourself why you're doing this.  If you're attempting to conserve stack space then this is not the solution you're looking for.  If you're attempting to reduce the number of context switches experienced then you can still get much the same results using thread mode and blocking "switched out" tasks on an event that gets signaled when it's their turn to run.  And of course there's a 3rd, although usually less desirable, option to redesign the way you're approaching the problem of a large number of work items.
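
    To illustrate the thread mode alternative, here is a small sketch using Python's threading module.  It is only an illustration of the "block on an event until it's your turn" pattern, with names I made up (Task, run_round_robin); it is not CLR hosting code.

```python
import threading


class Task:
    """A thread that only runs while the scheduler has handed it a turn."""
    def __init__(self, name, steps, log):
        self.name = name
        self.steps = steps
        self.log = log
        self.my_turn = threading.Event()   # signaled when this task may run
        self.yielded = threading.Event()   # signaled when it gives the turn back
        self.done = False
        self.thread = threading.Thread(target=self._run)

    def _run(self):
        for step in range(self.steps):
            self.my_turn.wait()            # "switched out": blocked on the event
            self.my_turn.clear()
            self.log.append((self.name, step))
            if step == self.steps - 1:
                self.done = True
            self.yielded.set()             # hand control back to the scheduler


def run_round_robin(tasks):
    """Give each unfinished task one step per round, one task at a time."""
    for task in tasks:
        task.thread.start()
    pending = [task for task in tasks if task.steps > 0]
    while pending:
        for task in list(pending):
            task.yielded.clear()
            task.my_turn.set()             # "switch in" exactly one task
            task.yielded.wait()
            if task.done:
                pending.remove(task)
    for task in tasks:
        task.thread.join()
```

    Only one task is ever runnable at a time, so the execution order is as deterministic as it would be with fibers, but every task keeps its own real thread and stack.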

    Finally what are the changes in removing fiber mode?  Surprisingly everything will work exactly the same as it did in pre-RC and RTM builds except that one API will now return E_NOTIMPL.  That one API is ICLRTask::SwitchOut.  You still can do all the interesting stuff with integrating the CLR task management with your own environment, and you can still switch a logical task in on a thread that previously had another logical task running on it, but you can't switch out a logical task during the middle of its lifetime. 

    I realize removing this feature will cause some of you some pain but I hope you'll see that this was the right call for us to get a super stable Whidbey out the door in a reasonable timeframe.

     

     

  • Cooperative Fiber Mode Sample - Day 11

    Last week we went over the Abort and RudeAbort APIs.  This week we’ll go over Fiber.Exit.  This API provides a way for you to terminate a running fiber, and demonstrates a use of the unmanaged ICLRTask::ExitTask API.   The ExitTask API can only be called on a fiber that is currently switched in which gives us a similar problem to what we had in Abort/RudeAbort.  But because the ExitTask call will not block for a long period of time we handle it a little differently than Abort and RudeAbort.

     

    After checking & updating our state like we did in Abort and RudeAbort, the Exit API does:

     

    if (this == Fiber.CurrentFiber)
    {
        // exiting a running task is easy
        InternalExitTaskAndThread(fiberAddr);
    }
    else
    {
        // only tasks that are running can call ICLRTask::ExitTask.
        // We need to switch the task in, call ExitTask, and then
        // switch back to our current fiber.
        InternalExitTaskAndSwitch(fiberAddr, Fiber.CurrentFiber.m_fiberAddr);
    }

     

    Here you can see we take one of two actions.  Either we call InternalExitTaskAndThread() which will cause both the current thread and fiber to exit, or we call into InternalExitTaskAndSwitch. 

     

    The first call is pretty self-explanatory.  We’re the current thread, we want to let the whole thing die off – this is exactly the same as calling DeleteFiber on the currently running thread.

     

    The second call is a little different.  Here we have a fiber that is currently switched out.  We want to re-schedule the fiber we’re asking to exit, call ExitTask on it, and then switch back to our previously running fiber.   Here we go back to the code in SwitchIn that we looked at in article 4:

     

    if(curTask->FlagCheck(TASK_FLAG_EXITING))
    {
        _ASSERTE(curTask->GetSwitchingTo());
        curTask->ExitTask();
        curTask->GetSwitchingTo()->SwitchIn();
    }

     

    The task has been requested to exit, we call ICLRTask::ExitTask on it, and then we switch back to the previously running task having completed the work the user requested.
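
    The two paths can be summarized with a tiny Python model.  This is only a sketch of the control flow described above: HostModel and exit_fiber are hypothetical names, fibers are plain strings, and no real switching or thread exit happens.

```python
class HostModel:
    """Toy host that tracks which fiber is switched in and logs each step."""
    def __init__(self, current):
        self.current = current
        self.log = []

    def exit_fiber(self, fiber):
        if fiber == self.current:
            # Exiting the running fiber: the fiber and its thread both go away.
            self.log.append("exit %s and its thread" % fiber)
            self.current = None
        else:
            previous = self.current
            self.current = fiber                      # switch the target in
            self.log.append("switch in %s" % fiber)
            self.log.append("exit %s" % fiber)        # ExitTask is legal now
            self.current = previous                   # switch back to the caller
            self.log.append("switch back to %s" % previous)
```

    Exiting a switched-out fiber costs two extra switches, and the caller ends up running again exactly where it was.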

     

    One noteworthy point here is that like the DeleteFiber APIs, the ExitTask API is potentially unsafe.  This is because we’re exiting the thread with code higher up still on the stack.  This code may have resources to free or other state to clean up.  Typically a host would only call ExitTask after all managed code has left the stack. 

     

    That brings us to the end of the CoopFiber series.  We’ve gone over the smallest set of APIs that are required for implementing fiber mode.  We’ve then exposed the APIs to the managed code author to allow them to be in control of the scheduling.  To implement a scheduler on top of this would require updating the synchronization primitives so they’d select new tasks and switch them in instead of blocking the thread.  And hopefully you’ve walked away from all of this with a better understanding of the interactions between the CLR and the host when running in fiber mode. 

  • Cooperative Fiber Mode Sample - Day 10

    In this article we’re going to look at the implementation of Abort and RudeAbort.  In the last article I mentioned that a typical host wouldn’t need to expose these to managed code.  Why is that?  If you have a fiber mode scheduler typically Thread.Abort will be sufficient.  Your scheduler will eventually schedule the task, and the task will then be aborted.

     

    But in the CoopFiber sample we don’t have a scheduler.  Therefore a user could call Thread.Abort on a switched out fiber, and it would never get scheduled to be aborted.  What’s worse is that the call into ICLRTask::Abort will block until the thread gets scheduled again.  

     

    To handle this peculiarity when an Abort request comes in the Fiber class first determines whether the task is running or not.  If it is already running then we’ll go ahead and abort it right away.  Otherwise we’ll set a bit to notify that on the next switch we should abort the task.  Here’s the Fiber.Abort implementation:

     

    bool fAbortNeeded = false;

    lock (m_syncObj)
    {
        m_FiberStates |= FiberStates.AbortRequested;
        if ((m_FiberStates & FiberStates.Running) != 0 && (m_FiberStates & FiberStates.Switching) == 0)
        {
            // the fiber is running and not in the middle of a switch, so we can initiate the abort now.
            fAbortNeeded = true;
        }
    }
    if (fAbortNeeded)
    {
        InternalAbort(m_fiberAddr);
    }

     

    InternalAbort will just turn around and call ICLRTask::Abort on our already switched in task.  In our last article we saw our calls to CheckForAbort during task switch in.  That code simply looks like:

     

    // Next see if the user requested an abort on this thread.
    if ((m_FiberStates & FiberStates.AbortRequested) != 0)
    {
        Abort();
    }
    else if ((m_FiberStates & FiberStates.RudeAbortRequested) != 0)
    {
        RudeAbort();
    }

               

     

    We simply call back to Abort or RudeAbort, which will then call into InternalAbort to finally Abort/RudeAbort the task.  The task is now switched in and running so we’ll call into the host to perform the actual abort (which now won’t block because the task is running).
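
    Here is the deferred abort logic as a small Python model.  It is a simplified sketch, not the sample's code: FiberModel and its methods are hypothetical names, the lock is left out because the model is single-threaded, and setting a flag stands in for the real host call.

```python
RUNNING, SWITCHING, ABORT_REQUESTED = 1, 2, 4


class FiberModel:
    def __init__(self):
        self.states = 0
        self.aborted = False

    def _internal_abort(self):
        # Stand-in for the host call; only valid once the fiber is switched in.
        assert self.states & RUNNING
        self.aborted = True

    def abort(self):
        self.states |= ABORT_REQUESTED
        if (self.states & RUNNING) and not (self.states & SWITCHING):
            self._internal_abort()         # running right now: abort immediately

    def switch_in(self):
        self.states |= RUNNING
        self.states &= ~SWITCHING
        self.check_for_abort()

    def check_for_abort(self):
        if self.states & ABORT_REQUESTED:
            self.abort()                   # now running, so this will not defer
```

    An abort requested on a switched-out fiber just sets the bit and returns; the abort itself happens the next time that fiber is switched in.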

     

    At this point you may be asking yourself “What is this Rude Abort thing?”  Rude aborts are actually a new Whidbey feature, and can be accessed one of several ways.  The only difference between a rude abort and a normal thread abort is that the rude abort will not execute your catch/finally blocks.  In addition to being exposed directly through the hosting APIs rude aborts can also occur due to escalation policy (also configured via the hosting APIs).  That’ll have to be saved for another blog entry…

     

    CoopFiber’s Rude Abort implementation is essentially identical to the Abort implementation.  The only difference is that we’ve replaced the word Abort with RudeAbort everywhere.

     

    So that’s it for the Abort/Rude Abort implementation.  We’ve almost reached the end of this series, but there’s one last API that’s worth covering on the managed side: Fiber.Exit.  I’ll go over that implementation next article.

  • Cooperative Fiber Mode Sample - Day 9

    The managed Fiber class exists in its own directory in the SDK sample appropriately called Fiber.  The Fiber class is designed to be vaguely similar to the managed Thread class.  For example, just as Thread has ThreadState, Fiber has FiberStates.

     

    The managed fiber class exposes a few significant APIs:

    ·        public static Fiber CreateFiber(FiberStart fs)

    ·        public static Fiber CurrentFiber

    ·        public void SwitchTo()

    ·        public void Abort()

    ·        public void RudeAbort()

    ·        public void Exit()

    ·        public FiberStates FiberState

     

    The CreateFiber API should be obvious.  It takes a FiberStart delegate and passes it into the unmanaged InternalCreateFiber API we previously looked at.

     

    CurrentFiber is nearly just as obvious: We store the current fiber in managed thread local storage (or in our case it’s really fiber local storage) so we can easily fetch it.  If we haven’t created a managed Fiber object for this fiber yet we’ll get it from the host via a P/Invoke into the InternalGetCurrentFiber API.  This simply gets the currently executing CHostTask, AddRef’s it, and returns it.  The managed half simply sticks this into our SafeHandle to ensure it’s properly cleaned up.

     

    The last trivial method here is FiberState.  This just returns m_FiberStates which the managed Fiber class maintains as the fibers switch through various states.

     

    The interesting APIs here though are actually SwitchTo, Abort, and RudeAbort.  We’ll start by looking at SwitchTo.

     

    The first thing we watch out for in the managed implementation is that we don’t try and switch a fiber in on 2 threads:

     

    Fiber curFiber = CurrentFiber;

    lock (m_syncObj)
    {
        if ((m_FiberStates & (FiberStates.Running|FiberStates.Switching)) != 0)
        {
            throw new FiberStateException("Attempt to switch to a fiber that is running or already switching!");
        }
        m_FiberStates &= (~FiberStates.Unstarted);
        m_FiberStates |= FiberStates.Running;
    }

     

    If the fiber was already running, or is currently involved in a switch, then we’ll throw an exception that you cannot currently switch.  We’ll then update the fiber we’re trying to switch in so it’s now officially running (this will prevent anyone else from switching it in).

     

    Next we need to make sure that no one races in and tries switching in the fiber we’re switching away from. 

     

    lock (curFiber.m_syncObj)
    {
        m_FiberStates |= FiberStates.Switching;
    }

    // we've marked us as switching, do one last check to see if someone wanted to abort us.
    // Anyone after this will get us aborted on our next switch in.
    CheckForAbort();

     

    So we update the current fiber’s state to mark that it’s switching out.  You’ll also notice we’re checking to see if the fiber needs to be aborted.  We’ll cover this more in-depth in the next article where I discuss the Abort & RudeAbort implementation.

     

     

          m_prevFiber = curFiber;

     

          // switch in the fiber the user requested

          InternalSwitchIn(m_fiberAddr);

     

          curFiber.OnFiberSwitchIn();

     

    Next we mark what the previous fiber was (we’ll need to update it to remove the Running & Switching bits), and then call back into the unmanaged host to perform the actual switch.  This goes into the API that we looked at in Article 7 – we simply switch out the current task, and switch in the new one that we passed in.

     

    Finally we call OnFiberSwitchIn.  This is another place where we have both a top half before the switch and a bottom half that runs after we’ve switched away and been switched back.  When we perform the switch, a task that is being scheduled for the first time ends up at RealFiberStart.  A task that has already switched out once ends up at the curFiber.OnFiberSwitchIn line, but we’ll be on a different fiber (we’ll be on curFiber).  In either case we’ll set the state on the previous fiber so that it’s schedulable again, and we’ll check the newly switched in fiber to see if it should be aborted.

     

    And that’s how we switch fibers.  You first create a fiber, and then you call SwitchTo on the newly created fiber.  We largely just do some bookkeeping, and then call directly into the host to do the switch.  It’s that simple.

     

    That takes us through nearly all the basic mechanisms of the fiber mode host.  In the next article I’m going to discuss how the Fiber class exposes Abort and RudeAbort APIs.  A typical host wouldn’t expose these to managed code, but the CoopFiber sample has its reasons for doing so…

     

    [9/16 8:00PM - Fixed grammar]

  • Cooperative Fiber Mode Sample - Day 8

    It seems like a good time to take a breather and look at what we have going on so far. 

     

    First, we’ve started the runtime.  We’ve handed our IHostControl off to the runtime.  The runtime has called back to our IHostControl and gotten a couple of managers.  Those are our task manager and our synchronization manager.  Whenever the runtime needs to create a thread it’s called us back, and we’ve created a brand new thread.  Sure, we’ve converted it into fibers, but we’ve done no real fiber switching. 

     

    Whenever the runtime needed a manual or auto event we’ve provided one via the synchronization APIs.  When it needed to block on it, it called us, and we blocked on it via IHostAutoEvent or IHostManualEvent.  If another thread alerted the thread blocked on the event, the runtime called us via the IHostTask APIs, and we queued an APC to it.

     

    If the runtime needed to enter a critical section, it called us, and we blocked or allowed it to enter via the IHostCrst APIs. 

     

    We’ve set up everything necessary to replace the CLR’s threading model, but for the most part we’ve merely delegated to the OS APIs.  Sure, there were minor differences here and there…  But for every task the CLR asks for we create 1 physical thread.  And sure every one of those physical threads has already been converted to a fiber, but we never switch those fibers out. 

     

    Starting with the next article that’ll start to change.

  • Cooperative Fiber Mode Sample - Day 7

    At this point we’ve covered nearly all the major components of the unmanaged host.  The one remaining detail is the interface that’s used between the managed fiber mode implementation and the unmanaged host.  This is all implemented in callbacks.cpp.  We essentially expose an API for all the major operations we allow the fiber API to do.  These include creating fibers, switching fibers in, aborting fibers, and exiting the current fiber.

     

    The creation of a fiber differs from the CreateTask API we covered before in that we aren’t creating a new thread for this fiber.  This fiber is newly created and won’t run until a user switches it in.  Aborting fibers is exactly like aborting a thread, but a fiber will need to be switched in to be properly aborted. 

     

    Switching in fibers is interesting.  Here you’ll find no “SwitchOut” API like the internal unmanaged SwitchOut method on CHostTask.  This is because we cannot run managed code on a “switched out” task.  It has to have some task to run on!  Instead CoopFiber exposes one API for switching in a new task that also switches out the current task.

     

    All of these methods are merely thin wrappers over the functionality we’ve already covered.  Two interesting examples are InternalCreateFiber and InternalSwitchIn.

     

    First let’s look at InternalCreateFiber:

     

          __declspec(dllexport) PVOID InternalCreateFiber(PVOID startAddr)

          {

                CHostTask *tmp = CHostTask::CreateTask();

                if (tmp == NULL)

                {

                      _ASSERTE(FALSE && "Out of memory in CreateTask!");

                      return NULL;

                }

     

                tmp->SetManagedStart(reinterpret_cast<LPTHREAD_START_ROUTINE>(startAddr));

               

                tmp->AddRef();

     

                tmp->SetFiberAddress(::CreateFiber(0,(LPFIBER_START_ROUTINE )CHostFiberProc,(PVOID)tmp));

     

                return(tmp);

          }

     

    This function creates a new CHostTask, sets the start address (which is a delegate passed from managed code), and creates a new fiber for the task.  We return the task back to managed code which now holds the only reference to it.

     

    The interesting thing to note is the management of the lifetime of the CHostTask.  We have 1 reference to it from the managed code.  As we’ll see later the managed code uses a new Whidbey feature called SafeHandle.  This ensures that when the managed Fiber object is collected our InternalReleaseFiber method will be called.
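    The ownership rule described here (one AddRef on creation, one Release when the managed wrapper goes away) can be sketched in portable C++.  This is an analogy rather than the sample's code: TaskModel, FiberHandle, and CreateTaskModel are hypothetical names, and the RAII destructor stands in for what SafeHandle does when it releases its handle.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

// Stand-in for CHostTask: only the COM-style reference count is modelled.
class TaskModel {
public:
    explicit TaskModel(int* destroyedCounter) : m_destroyed(destroyedCounter) {}

    unsigned long AddRef() { return ++m_refs; }

    unsigned long Release() {
        unsigned long remaining = --m_refs;
        if (remaining == 0) {
            ++*m_destroyed;   // lets a caller observe the destruction
            delete this;
        }
        return remaining;
    }

private:
    ~TaskModel() = default;   // destruction only happens through Release()
    std::atomic<unsigned long> m_refs{0};
    int* m_destroyed;
};

// Plays the role of the managed SafeHandle: it adopts the single reference
// handed out by the create call and releases it exactly once.
class FiberHandle {
public:
    explicit FiberHandle(TaskModel* task) : m_task(task) {}
    FiberHandle(const FiberHandle&) = delete;
    FiberHandle& operator=(const FiberHandle&) = delete;
    FiberHandle(FiberHandle&& other) noexcept
        : m_task(std::exchange(other.m_task, nullptr)) {}
    ~FiberHandle() { if (m_task) m_task->Release(); }

    TaskModel* get() const { return m_task; }

private:
    TaskModel* m_task;
};

// Mirrors the shape of InternalCreateFiber: create, AddRef once, hand the pointer out.
TaskModel* CreateTaskModel(int* destroyedCounter) {
    TaskModel* task = new TaskModel(destroyedCounter);
    task->AddRef();
    return task;
}
```

    Making the destructor private is a design choice of the sketch: it forces every owner to go through Release, so the task cannot be deleted while a handle still refers to it.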

     

    Next let’s look at InternalSwitchIn:

     

          __declspec(dllexport) void InternalSwitchIn(CHostTask *task)

          {

                CHostTask *curTask = CHostTask::GetCurrentTask();

     

                _ASSERTE(curTask && task);

     

                curTask->SwitchOut();        

                task->SwitchIn();

          }

     

    Here we get the current task, switch it out, and switch in the task passed in.  Again, we have another simple thin wrapper over the APIs we’ve already seen.  All of the other callbacks are thin wrappers as well.

     

    So we’ve now covered nearly all the main principles of the unmanaged code.  Soon we can start to discuss what the host is going to expose to the managed code author!

     

  • Cooperative Fiber Mode Sample - Day 6

    The synchronization primitives are handed off to the runtime by the IHostSyncManager interface.  We’ve already provided this to the runtime through our GetHostManager callback on IHostControl.  Our IHostSyncManager is implemented on our IHostTaskManager and for the most part we just create objects and hand them off.  For example CreateCrst looks like:

     

          *ppCrst = (IHostCrst *)new CHostCriticalSection(this);

          if(NULL == *ppCrst)

          {

                return(E_OUTOFMEMORY);

          }

          (*ppCrst)->AddRef();

     

    And none of them are very different.  Our SetCLRSyncManager does nothing.  A more sophisticated host could use the ICLRSyncManager to help perform deadlock detection.
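    One detail worth keeping in mind when reading the NULL check in CreateCrst: in standard C++ a plain new expression reports failure by throwing std::bad_alloc, so a check like that only fires if the build uses a non-throwing allocator.  The article doesn't say which applies to the sample.  The sketch below shows the same create-check-AddRef pattern with an explicit std::nothrow; CritSecModel, CreateCrstModel, and the CreateResult enum are hypothetical stand-ins for CHostCriticalSection, CreateCrst, and the HRESULT values.

```cpp
#include <cassert>
#include <new>

// Stands in for S_OK / E_OUTOFMEMORY so the sketch builds without the Windows headers.
enum class CreateResult { Ok, OutOfMemory };

// Stand-in for CHostCriticalSection; only the reference count is modelled.
class CritSecModel {
public:
    void AddRef() { ++m_refs; }
    void Release() { if (--m_refs == 0) delete this; }
    int RefCount() const { return m_refs; }

private:
    int m_refs = 0;
};

// Create the object, report allocation failure as a result code, and hand the
// caller exactly one reference on success.
CreateResult CreateCrstModel(CritSecModel** ppCrst) {
    *ppCrst = new (std::nothrow) CritSecModel();   // yields nullptr instead of throwing
    if (*ppCrst == nullptr) {
        return CreateResult::OutOfMemory;
    }
    (*ppCrst)->AddRef();
    return CreateResult::Ok;
}
```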

     

    CoopFiber implements 4 synchronization primitives: Auto Events, Manual Events, Critical Sections, and a Semaphore.  All of these with the exception of the critical section are just thin wrappers over the equivalent OS API.  For example looking at CHostManualEvent::Wait:

     

          DWORD result;

          if(option & WAIT_ALERTABLE)

          {

                result = WaitForSingleObjectEx(m_hEvent, dwMilliseconds, true);

          }

          else

          {

                result = WaitForSingleObject(m_hEvent, dwMilliseconds);

          }

     

          switch(result)

          {

                case WAIT_OBJECT_0:

                      return(S_OK);

                case WAIT_ABANDONED:

                      return(HOST_E_ABANDONED);                

                case WAIT_IO_COMPLETION:

                      CHostTask::GetCurrentTask()->FlagSet(TASK_FLAG_ALERTED, false);

                      return(HOST_E_INTERRUPTED);

                case WAIT_TIMEOUT:

                      return(HOST_E_TIMEOUT);

                default:

                      _ASSERTE(!"Shouldn't reach here");

                      return(E_FAIL);        

          }

     

    We can see an implementation that is nearly identical to the Join we saw last time.  Both the auto event and semaphore implementations are nearly identical.  More sophisticated hosts could perform deadlock detection below these events (for reader/writer locks and monitors built on top of the events), or they could choose to schedule other fibers on these threads.  For this simple implementation we simply block the current thread.
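    To make the "wait, then translate the result" shape concrete without the Win32 calls, here is a portable sketch of a manual-reset event built on the C++ standard library.  It is an assumption-laden model, not the host's code: ManualEventModel and WaitResult are invented names, only the signaled and timeout outcomes are modelled, and there is no equivalent of an alertable wait or of WAIT_ABANDONED.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <mutex>

// Stands in for the S_OK / HOST_E_TIMEOUT pair returned by the host's Wait.
enum class WaitResult { Signaled, TimedOut };

class ManualEventModel {
public:
    // Marks the event signaled and wakes every waiter; it stays signaled until Reset().
    void Set() {
        {
            std::lock_guard<std::mutex> guard(m_mutex);
            m_signaled = true;
        }
        m_cv.notify_all();
    }

    void Reset() {
        std::lock_guard<std::mutex> guard(m_mutex);
        m_signaled = false;
    }

    // Blocks the calling thread until the event is signaled or the timeout elapses,
    // then translates the outcome into a result code.
    WaitResult Wait(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(m_mutex);
        bool signaled = m_cv.wait_for(lock, timeout, [this] { return m_signaled; });
        return signaled ? WaitResult::Signaled : WaitResult::TimedOut;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    bool m_signaled = false;
};
```

    Like the host's version, this blocks the whole thread; nothing here gives another fiber a chance to run while the caller waits.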

     

    The critical section is the only synchronization primitive which doesn’t simply wrap the OS API.  This is because the critical section is owned by a specific thread (unlike the events and semaphore).  Because of this the critical section must be aware of the fibers in addition to threads.  This will prevent one fiber from acquiring the critical section, and having another fiber re-acquire on the same thread.

     

    The essence of the critical section acquire lives in TryEnter, which Enter builds on:

     

     

          if(m_holderTask == NULL)

          {

                CritSecHolder cs(&m_critSec);

     

                // no one holds the critical section

                if(m_holderTask == NULL && WaitForSingleObject(m_hEvent, 0) == WAIT_OBJECT_0)

                {

                      // we've acquired the crit section

                      m_holderTask = curTask;

                      m_dwEnterCount = 1;

                      *pbSucceeded = TRUE;

                      return(S_OK);

                }

          }

          else if(m_holderTask == curTask)

          {

                CritSecHolder cs(&m_critSec);

                // we already hold the critical section

                if(m_holderTask == curTask)

                {

                      m_dwEnterCount++;

                      *pbSucceeded = TRUE;

                      return(S_OK);

                }                

          }

     

          return(S_OK);

     

    There are a couple of points to note.  First, we use a critical section to protect our state.  We don’t want to worry about other threads entering.  Also we use an event to block the task if the critical section cannot be acquired.  A more sophisticated fiber scheduler would want to re-schedule another fiber rather than block the current thread.
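    The key idea, that ownership is tracked per task rather than per thread, can be shown in a few lines of portable C++.  This is a simplified sketch under stated assumptions: TaskId and TaskCritSecModel are hypothetical names, the state lock is always taken (the host's TryEnter checks m_holderTask once before taking it), and there is no event for blocking waiters.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a CHostTask; only its address is used as an identity.
struct TaskId { int value; };

class TaskCritSecModel {
public:
    // Succeeds if nobody holds the section or if `task` already holds it.
    // Two fibers sharing one OS thread are still two different tasks here.
    bool TryEnter(const TaskId* task) {
        std::lock_guard<std::mutex> guard(m_state);
        if (m_holder == nullptr) {
            m_holder = task;
            m_enterCount = 1;
            return true;
        }
        if (m_holder == task) {
            ++m_enterCount;   // recursive acquire by the owning task
            return true;
        }
        return false;
    }

    // Undoes one TryEnter; the section is free again once the count reaches zero.
    void Leave(const TaskId* task) {
        std::lock_guard<std::mutex> guard(m_state);
        assert(m_holder == task && m_enterCount > 0);
        if (--m_enterCount == 0) {
            m_holder = nullptr;
        }
    }

private:
    std::mutex m_state;                 // protects the two fields below
    const TaskId* m_holder = nullptr;   // plays the role of m_holderTask
    unsigned m_enterCount = 0;          // plays the role of m_dwEnterCount
};
```

    With tasks a and b, TryEnter(&a) succeeds twice in a row, TryEnter(&b) fails until a has called Leave twice, and only then can b acquire the section.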

     

    Those are the core synchronization primitives that we’re using in CoopFiber.  You can see by limiting the scope of what CoopFiber does we were able to merely rely upon the OS APIs in most circumstances.  Next time I’ll start discussing the managed / unmanaged fiber API interface.

  • Cooperative Fiber Mode Sample - Day 5

    Last time we successfully created a task and started it running.  That’s an accomplishment, but there are a couple of details we should get out of the way before we start to dive deeper into the fiber mode implementation.  Both are methods on IHostTask: Alert and Join.  Once again everything else is trivial or unimportant for CoopFiber’s implementation.

     

    Last time we saw a peek at Alert.  This was the m_hCurThread member variable that we set when a fiber was switched in.  Our Alert implementation uses this:

     

          if(FlagCheck(TASK_FLAG_RUNNING))

          {

                QueueUserAPC(&MyAPCProc, m_hCurThread, NULL);

          }

          FlagSet(TASK_FLAG_ALERTED,true);   

     

          return(S_OK);

     

     

    Here we check if the current task is running, and if so we queue an APC to that task.  The queued APC is just a simple do-nothing function to wake the task out of its blocking operation.  Whether the task is running or not we set the alert bit. 
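    The FlagSet and FlagCheck helpers aren't shown in this series, so here is one way they could look, together with a model of the Alert logic.  Everything in this sketch is an assumption for illustration: the TaskFlagsModel class, the bit values, the use of std::atomic, and the wake callback that stands in for QueueUserAPC.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical bit values; the sample's real TASK_FLAG_* constants are not shown.
const std::uint32_t FLAG_RUNNING = 1u << 0;
const std::uint32_t FLAG_ALERTED = 1u << 1;

class TaskFlagsModel {
public:
    // Sets or clears one flag bit without disturbing the others.
    void FlagSet(std::uint32_t flag, bool on) {
        if (on) {
            m_flags.fetch_or(flag);
        } else {
            m_flags.fetch_and(~flag);
        }
    }

    bool FlagCheck(std::uint32_t flag) const {
        return (m_flags.load() & flag) != 0;
    }

private:
    std::atomic<std::uint32_t> m_flags{0};
};

// Mirrors the shape of Alert: wake the task only if it is running, but always
// record the alert.  `wake` is a hypothetical callback standing in for QueueUserAPC.
template <typename Wake>
void AlertModel(TaskFlagsModel& task, Wake wake) {
    if (task.FlagCheck(FLAG_RUNNING)) {
        wake();
    }
    task.FlagSet(FLAG_ALERTED, true);
}
```

    Alerting a task that isn't running leaves only the alerted bit behind, which matches the behaviour described above: nothing is woken until the task is scheduled again.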

     

    One interesting aspect of this simple sample is that if the user doesn’t schedule the task, it can never respond to the alert.  A more complex host may choose to give priority to alerted tasks or schedule them immediately.

     

    The next API to look at is the Join implementation:

     

    HRESULT __stdcall CHostTask::Join(DWORD dwMilliseconds, DWORD option)

    {

          DWORD result;

     

          if(option & WAIT_ALERTABLE)

          {

                result = WaitForSingleObjectEx(m_hTaskExitedEvent, dwMilliseconds, true);

          }

          else

          {

                result = WaitForSingleObject(m_hTaskExitedEvent, dwMilliseconds);

          }

     

          switch(result)   

          {

                case WAIT_OBJECT_0:

                      return(S_OK);

                case WAIT_TIMEOUT:

                      return(HOST_E_TIMEOUT);

                case WAIT_IO_COMPLETION:

                      return(HOST_E_INTERRUPTED);

          }

     

          _ASSERTE(!"Shouldn't reach here");

          return(E_FAIL);

    }

     

    Here we simply block the current task on an event that is set in our internal CHostTask::ExitTask API.  We’ll do an alertable wait if requested, in which case the APC queued in Alert will wake us up.  Finally Join translates the result of WaitForSingleObject* into the appropriate host HRESULT.

     

    We’ve mentioned ExitTask twice now (once in SwitchIn and here again) so now seems like an appropriate time to cover it.  This internal API merely calls the ICLRTask::ExitTask callback, notifying the runtime the task has exited, and updates our internal bookkeeping:

     

        if(m_pCallback!=NULL)

        {

            m_pCallback->ExitTask();   

        }

     

        SetEvent(m_hTaskExitedEvent);

       

        FlagSet(TASK_FLAG_EXITED,true);

        FlagSet(TASK_FLAG_RUNNING,false);

     

    The one interesting call worth noting here is that we set the event that allows our Joined tasks to wake up.  A more complicated host would use Join as an opportunity to deschedule a fiber and would need a more sophisticated mechanism of handling exited tasks.

     

    Well that wraps it up for this edition…  There’s only one more piece of unmanaged code before we start getting into the managed world.  Next time we’ll delve into the synchronization primitives.

  • Cooperative Fiber Mode Sample - Day 4

    Last time we left off with the CLR calling into the host to create a task.  So far everything’s been very simple.  For each interface we’ve only needed to implement a couple of methods to get the bulk of the work done.  While I’ve certainly left out a couple of lines of code here and there, nearly every other API in the interface merely returns S_OK.

     

    Now things start to get really interesting!  The IHostTask interface has just a few functions which are exposed to the runtime.  But there’s a lot of other functionality related to the task embedded in here.

     

    We’ll start off at the natural place to start a task: Start.  It’s pretty simple:

     

          FlagSet(TASK_FLAG_NOTSTARTED,false);

     

          _ASSERTE(m_fiberAddr == NULL);

          m_fiberAddr = ::CreateFiber(0,(LPFIBER_START_ROUTINE )CHostFiberProc,(PVOID)this);

     

          // create the new thread

          ::CreateThread(NULL,NULL,reinterpret_cast<LPTHREAD_START_ROUTINE>(::StartNewThread),this,NULL,NULL);

     

        return(S_OK);

     

     

    We create a fiber for the new task, and we create a new thread.  This new thread starts in StartNewThread, defined in callbacks.cpp.  It looks something like:

     

          CHostTaskManager *myManager = CHostTask::GetManager();

          CHostTask *curTask = static_cast<CHostTask*>(lpParameter);

     

          LPVOID fiberAddr = ConvertThreadToFiber(NULL);

               

          curTask->SwitchIn();

         

          BOOL fResult = ConvertFiberToThread();

     

    Here we simply convert the new thread over to fibers, and then switch in the fiber passed as the argument.  This brings us back to our CHostTask implementation where we need to look at the SwitchIn logic.  This is the most complicated piece of code we’ve encountered yet.

     

           CHostTask *curTask = GetCurrentTask();

           _ASSERTE(curTask != this);

          

           TlsSetValue(CHostTask::CurTaskTlsIndex,this);

           this->AddRef();

     

           // Save our thread handle so we can queue APCs

           if(!DuplicateHandle(GetCurrentProcess(),

                               GetCurrentThread(),

                               GetCurrentProcess(),

                               &m_hCurThread, 0, FALSE,

                               DUPLICATE_SAME_ACCESS))

           {

                  m_hCurThread = INVALID_HANDLE_VALUE;

                  _ASSERTE(!"Duplicate handle failed");

           }

     

           FlagSet(TASK_FLAG_RUNNING, true);

           ::SwitchToFiber(GetFiberAddress());     

     

           // when the fiber switches back we need to switch in our

           // previous task.

           if(curTask->m_pCallback!=NULL)

           {

                  HRESULT hr = curTask->m_pCallback->SwitchIn(GetCurrentThread());

                  _ASSERTE(SUCCEEDED(hr));

           }

          

           if(curTask->FlagCheck(TASK_FLAG_EXITING))

           {

                  _ASSERTE(curTask->GetSwitchingTo());

                  curTask->ExitTask();

                  curTask->GetSwitchingTo()->SwitchIn();

           }

     

           // release the ref for TLS from the previous thread

           CloseHandle(m_hCurThread);

           m_hCurThread = INVALID_HANDLE_VALUE;

           Release();

                 

           SetThreadPriority(GetCurrentThread(), curTask->m_iPriority);

           SetThreadLocale(curTask->m_lcid);

     

     

    What’s going on here?  We have a couple of things to worry about when switching tasks.  We have the task that we’re switching to (this) and we have the task we’re switching from (curTask).

     

    One of the issues we’re concerned with is the lifetime of the task.  While it’s running we don’t want it cleaned up, so we hold a reference to it (in Thread Local Store).  A more complicated host would probably have a pool of tasks rather than the simple TLS mechanism. 
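    The "pin the running task with a reference held in thread local storage" idea can be sketched with standard C++ thread_local in place of TlsSetValue.  This is a simplified model, not the host's code: RefTask, BeginSwitch, and EndSwitch are hypothetical names, no real fiber switch happens, and restoring the slot in EndSwitch is an assumption of the sketch rather than something the SwitchIn code above does.

```cpp
#include <cassert>

// Stand-in for CHostTask; only the reference count is modelled.
class RefTask {
public:
    void AddRef() { ++m_refs; }
    void Release() { assert(m_refs > 0); --m_refs; }
    int RefCount() const { return m_refs; }

private:
    int m_refs = 0;
};

// Plays the role of the CurTaskTlsIndex slot: one "current task" per OS thread.
thread_local RefTask* g_currentTask = nullptr;

// Top half of a switch: record `next` as the current task and pin it with a
// reference so it cannot be cleaned up while it runs.  Returns the task that
// was current before the switch.
RefTask* BeginSwitch(RefTask* next) {
    RefTask* previous = g_currentTask;
    next->AddRef();
    g_currentTask = next;
    return previous;
}

// Bottom half, run once `previous` is scheduled again: `switchedTo` is no longer
// running on this thread, so its pin is dropped and the slot is restored.
void EndSwitch(RefTask* previous, RefTask* switchedTo) {
    switchedTo->Release();
    g_currentTask = previous;
}
```

    The reference taken in BeginSwitch is only released in EndSwitch, so a task stays alive for as long as it is the one recorded in the slot.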

     

    If we need to alert a task we’ll queue an Asynchronous Procedure Call (APC) to it.  Therefore the next thing we do is save the current thread’s handle into this task.  We’ll use this in IHostTask::Alert.  Finally we set the running flag and switch over to the new task.

     

    The interesting thing to note about this method is there are 2 halves to it.  After we call SwitchToFiber we are running on a different fiber on a different stack.  We’ll only return to the bottom half after someone has switched back to “curTask”.  This is the task that we switched away from.  When this happens it’s now the bottom half’s job to tell the runtime that “curTask” is once again running.

     

    It’s possible when we get to the bottom half “curTask” was only switched in to exit.  If so we’ll notify the runtime and immediately re-run the task we’ve been set to re-run.  This is an implementation detail of CoopFiber to allow calling ExitTask on the managed fiber implementation. 

     

    We’re nearly done so we clean up the resources we allocated in the top half for “this”.  It’s no longer running, so we don’t need a reference to it.  Finally we restore the settings for the thread that were stored in the task.  These would have been changed when we did the initial SwitchToFiber which either switched in a task that was at the bottom half of SwitchIn, or the top of CHostFiberProc (in callbacks.cpp).

     

    Wow, so that’s how we start a task!  We create a new thread, that thread gets converted over to fibers, and we switch in the newly created task (which has a fiber already associated with it).  That’ll end up in CHostFiberProc which we’ll cover in a future edition.

     

    There’s just one more detail to cap off the lifetime of a task, and that is our internal SwitchOut API.  All it essentially does is set some state and notify the runtime of the switch out:

     

          FlagSet(TASK_FLAG_RUNNING,false);

     

          if(m_pCallback!=NULL)

          {

                HRESULT hr = m_pCallback->SwitchOut();   

          }

     

     

    Next time we’ll go over the remaining APIs on IHostTask that we haven’t covered yet.

