Welcome to MSDN Blogs Sign in | Join | Help

(by Sasha Dadiomov) 

 

Isn’t it a common phenomenon that each thing has many faces?  If you were following the transactional memory community for some time, you probably saw it as a pretty theoretical area. There is obviously a lot of science here – discussions about things like “AME calculus” don’t leave much doubt. But there is more to be addressed before transactional memory becomes useful in practice…  Happy Users = Great Theory + Good Implementation + Something More.  This “Something More” is what I will try to explore in this blog.

So, imagine you have a wonderful implementation of Transactional Memory: correct, safe, scalable and quick. Will it be picked up by the world’s developers?  Well, let’s pretend to be one of them.

On day one, you wrote a few samples, played with performance, and became excited. On day two, you started to use it in a new feature of your current hairy project.  I bet your new feature will not work perfectly after your last keystroke from yesterday’s frantic coding session… well, how about debugging it?  Here, a naïve vendor of transactional memory may let you down. First, there may be multiple re-executions. Do you want your debugger to step through them all? And don’t forget that, with optimistic concurrency control, state before validation, which typically happens at the end of the transaction, may be inconsistent. It means you will see values that just don’t make sense. Yes, eventually TM will notice it, roll back the state and re-execute, but you don’t know it yet… you are sitting in front of a debugger, seeing x equal to 1 right after the previous line x=2.  Isn’t it confusing? Even if you were warned about optimistic concurrency control, you may find it hard to reason about your program with this distorted mirror…

Well, this problem can be solved. The debugger may be taught to validate a transaction before stopping at a breakpoint or stepping, so you will not see inconsistent state. But it means that the TM package must include a modified debugger. By the way, there are more things to fix in the debugger. For instance, shadow copy-based TM implies that there will be multiple per-transaction states of the data, so the debugger must show the correct one. Consider debugging a program with two threads: in thread A with no transaction, the value of x is 1; but thread B’s transaction may be setting x to 2. When you view the value of x under the debugger, you want to see 2 when stopped in thread B, but probably 1 when stopped in thread A (showing 2 is problematic, since transaction B haven’t committed yet – what if in the next re-execution B’s transaction would set x to 3? Changes should be invisible outside transaction until commit). Even validating transactions in the debugger is not enough: it additionally has to navigate between global and in-transaction states to show correct values. An alternative solution would be to stop all threads except one while debugging, but this is like looking for a wallet under the lamppost – you don’t want to change the real scenario for the sake of debugging ease. 

OK, the previous paragraph was intended to show that the TM vendor has to deliver a debugging solution with it. The same may apply to a profiler: while a sampling profiler will probably work as usual, the instrumenting one may get fooled by re-executions. And there are new all-important questions now - e.g. which variables cause most of the conflicts? These are your contention points – you probably will change your design to ease the bottleneck. Most of execution time may go into re-execution because of a single variable being modified by several threads concurrently, but how do you find it?

And what about tracing?  Do you want to see traces from all re-executions, or only from the valid ones? Do you want to see traces with inconsistent values, or they will just distract you attention from the real problems when you read the trace? Do you want to see events with inconsistent data?

Well, the picture is clear… changing the fundamental behavior never goes unpunished: the whole ecosystem of tools needs to follow the change. Tooling is the first element of “Something More”.

Now let us look from a totally different perspective: how do atomic blocks co-exist with other facilities used in your program? 

For starters, what about your old traditional transactions (e.g. MSDTC or System.Transactions in Windows)? After all, these were in wide usage before transactional memory inventors went to school. World developers know what transactions are; it would be utterly misleading if TM transactions would be different. It would be also pity if these two kinds of transactions were not usable together. Both strive for atomicity and isolation: everything inside a transaction happens all-or-nothing, and the world is isolated from intermediate states. The difference is domain: TM serves memory, while traditional transactions typically serve database, queues, files, and so on – but usually not memory!  Why? Would it be beneficial for you to have a guarantee that all your operations inside of a transaction are atomic and isolated – database, networking, memory, etc.?

I personally think it could make programs much simpler by eliminating the need to process all possible failure combinations. Imagine a program that executes some algorithm in memory, but also needs to keep a reliable and consistent reflection of the memory elements in a database. Your code will change data in the memory and in the database; you want them to be always equal, e.g. you cannot tolerate non-logged memory changes. To achieve it, you will need to roll back memory changes in case of database failures. But “roll back” smells familiar… isn’t it what TM knows how to do? And aren’t system transactions designed to serve multiple resources?  And why memory cannot be one of them? Well, it seems we will miss a lot of potential benefits if we don’t integrate these technologies. Moreover, TM without integration with system transactions runs a risk of confusing people, who will keep asking “which transaction did you mean”?  TM vendors will probably have to avoid the word “transaction” altogether... keeping in sync with the programmer’s mindset is another part of the “Something More”. Please note though that it would be bad to sacrifice performance of the pure memory scenarios for the benefit of a wider ones; it makes implementation even harder… it should work optimally by itself and “better together” with system transactions.

OK, we started speaking about co-existence with other features. Let’s look now at numerous I/O calls scattered around your program – or more widely, any calls with side effects (e.g. killing other process). What if your newly introduced atomic block includes some of them? Well, naive TM could possibly re-execute them, e.g. repeat your printf multiple times. Even worse – since it is possible now for the program to run on inconsistent (read: random) data, awful things may happen… for instance “if(x)FormatDisk()” may be actually called with random value of x! This is well-known TM problem, and any TM has to take care of it. Usually “care” is a red tape: just forbid it. Very limiting… makes one doubt whether he/she wants to use TM… so next variant is “irrevocability”: whenever a transaction tries its first I/O, it switches from speculative mode to a global lock – no re-executions are possible anymore. The cost is serialization - all other transactions in the system will be required to take the global lock, or be stopped. This approach is viable, but probably should be the last line of defense since it hurts scalability so much. Integration with system transactions opens several better avenues, utilizing the resource manager’s ability to postpone or roll back resource manager actions (e.g. easily defining a custom resource manager around the necessary side-effecting actions). Helpful would be also the ability to just defer some actions for commit time, or register compensations for the rollback case. And any approach will require some checks – which APIs are callable under transaction, and which should be forbidden.

Well, we mentioned a lot of stuff which a shipping TM has to address.  There is more, of course; what I wanted to show was just another face of transactional memory, the “outer” one – its integration with the development and debugging environment and with existing concepts and bodies of code; compatibility with environment is that mysterious “Something More” from the beginning of this blog.

Dana and I have been talking with Charles from channel 9 about transactional memory. Please check out the video. We like to call it “the TM holiday special” :-)

 

Charles has also interviewed our indefatigable QA-engineer-at-large Chris Dern on the novel concurrency testing techniques he and his buddies from MSR are discovering and applying. Check out the video here. I wish I had Chris' job! This is just too much fun...

 

Last but not least we also have a video from our peers working on an actor/agent-based language called Maestro.

 

 

Yours truly, wishing you a happy new year,

 

--Yossi

 

(by Dave Detlefs)

 

For me personally my involvement in the current Microsoft transactional memory project is my second attempt to tilt at the windmill of integrating transactional concepts more tightly into general-purpose programming languages.  In the late 1980's, while working on Ph.D., I was a member of the Avalon/C++ project at Carnegie-Mellon University, working with Professors Jeannette Wing and Maurice Herlihy.  This was an attempt to build support for transactions into C++.  This effort grew alongside CMU's Camelot project, headed by Alfred Spector (now at Google), a C library that provided transactional semantics.  This required the user to insert calls to begin and end transactions, to lock and log memory before updating it, etc -- in short, it provided all the capabilities you need, but it also had all the convenience and safety issues any library-based systems have: you have to go through the trouble of inserting all the right calls, and if you don't, you don't get any checking, just difficult-to-debug errors.  So augmenting a language to insert the appropriate library calls automatically, based on some higher-level specification of atomicity, was obviously attractive. Since the main C++ available at the time was still AT&T's "cfront" C++ compiler, which translated C++ to C, it was quite natural to modify the compiler to augment the translation with the required Camelot library calls.

 

Avalon/C++ was just one example of a small academic cottage industry. The University of Newcastle had a quite similar effort called Arjuna.  But probably the best known of these was the Argus language project done by Barbara Liskov's group at MIT.  (If I may briefly indulge in a bit of abject fandom, I think Professor Liskov is one of the real giants in the history of academic programming language research.  I'm hardly an unbiased observer: she was my original undergraduate advisor, and I did my undergraduate thesis work in a closely-allied group.  I base my statement on the fact that I started to become a mature programmer while programming in the CLU language that she and her team designed and implemented.  If you go look this up, you will be struck by how completely this language anticipated the design of current commercially successful languages, like Java or C# -- and I was able to use it *in 1979*.  Of course, other languages extant at the time had similar ideas, especially Simula, but CLU had (almost) everything: strong typing, garbage collection, data abstraction, even generics and iterators.  This is by no means meant to diminish the achievements of the designers of C# or other languages: the task of the designer of a commercial language is usually to tastefully choose which bold academic ideas from 10-20 years ago were successful.)

 

Argus and Avalon/C++ were different from similar to current transactional memory efforts in many ways, but differed in one very fundamental way.  Transactional memory, as this term is currently understood, takes the database "ACID" properties, and strips this down to A ("atomic") and I ("isolated").  C ("consistent") doesn't really apply -- in the database world, you're allowed to specify declaratively invariants that must hold, and at the end of a transaction, the transaction system checks whether you might have invalidated one of those "consistency invariants."  If so, it doesn't commit the transaction, thus preserving this user-defined notion of "consistency".  Transactional memory doesn't usually include such a notion of consistency (thought the word "consistency" is used in the TM literature for a quite different concept -- whether the current transaction has seen a consistent view of memory when optimistic reading techniques are used).

 

This leaves D = "durability."  In the database world, this property means that if a transaction commits, its effects will survive subsequent system crashes.  That is, if a crash happens, the state after the system recovers will reflect all the committed transactions -- if you're ATM tells you that you've successfully transferred $10,000 into your account, and then there's a city-wide power failure immediately thereafter, you can go home in the dark confident that your money is there.   Current TM systems (including the one we're working on) treat durability as a non-goal -- they're just trying to replace locks as a synchronization mechanism (and, at least in our case, to simplify error-handling code via failure atomicity).

 

The previous group of languages, however, did make durability a goal.  I'll describe the Argus design for this; the Avalon/C++ and Arjuna designs were fairly similar.  Argus, like its predecessor CLU, provides the ability to define abstract data types (i.e., classes).  In addition, it allows the definition of a special form of abstract type called a guardian: a guardian is an abstraction whose state is durable, and will be recovered to a transaction-consistent state after a crash.  Guardians are somewhat like CLR app domains, or the "agents" that communicate in concurrent languages based on message passing (e.g., CSP or Hewitt’s Actors language), in the following way: there is no sharing of objects between guardians -- each guardian encapsulates a set of objects disjoint from all others. Operations on guardians are like remote procedure calls, with any non-primitive arguments (or return values) marshaled by deep copy, to preserve the no-sharing rule.  Guardians live on nodes, i.e., machines, but you're not supposed to care whether a guardian you're communicating with is on the same or a different node -- in either case, its operations are remote procedure calls that execute on the guardian as transactions (and in fact nested transactions if the invoking code is already in a transaction, as when one guardian operation invokes an operation on a second guardian); the caller, as usual with network RPC's, must be prepared for the call to raise a "failure" exception for arbitrary reasons.  But note that now it knows that the system is in a reasonable state, since that failure indicates that the operation's transaction rolled back, restoring invariants.  If the calling code is also in a transaction, it now always has the option of aborting itself and raising a failure exception to its caller.

 

The operations themselves may introduce internal concurrency -- and this concurrency may also be managed by (sub)transactions: Argus already had the concept of "parallel nested transactions."  As with a normal abstract type, the guardian state is a set of instance variables.  Some of these are labeled "stable", and the rest are considered to be "volatile" -- but this doesn't mean what it means in C/C++ or similar languages.  The stable variables determine the guardian state that should survive a crash, and the volatile variables should represent derived state that can be recomputed from the stable state after a crash (like caches or database indices).

 

The type system is modified to define both atomic and non-atomic types.  The basic primitive types are atomic, but aggregate type constructors like arrays or structs now come in two flavors, atomic and not.  The atomic versions handle transaction locking and logging.  Since Argus was an object-oriented system, it's not enough to say that the stable instance variables of a guardian is recovered: we also have to worry about what is reachable from those variables in the pointer graph.  The intention (I forget whether it was checked) is that the transitive closure of what's reachable from the stable roots of a guardian should be atomic types, and that entire object graph is recovered.  The closure reachable (only) from volatile roots may contain cheaper non-atomic types.

 

A user-defined type may use those non-atomic types, but nevertheless declare itself be atomic.  Mechanisms are provided for users to do explicit transaction-level locking and logging.  In many cases this can allow more concurrency, and require less overhead, than if the abstraction were implemented directly with the built-in atomic type constructors.  This facility is quite similar to what the TM research community today discusses under the name "open nested transactions."

 

For all of these great features, none of these languages succeeded in the commercial marketplace.  Of course, they were research projects, not really intended to achieve commercial success, but rather to influence industrial practice at some time in the future. The world of object-oriented databases, once popular but now somewhat eclipsed, could be seen as one of the intellectual progeny of this work.  There are also now several systems that note the similarity between a row in a relational database table and the representation of an object type, and try to allow semi-automatic translation between these views, to allow manipulation of database tables to look more like "normal programming" than construction of SQL commands.  Microsoft's ADO.NET and Java's JDO are examples.  While I understand the reasons for the success of these models, to me, these seem (with all due respect) like somewhat clunkly attempts to regain the marriage of transactions, persistence, and ordinary objects and programming offered by Argus, Avalon, and the like.  But perhaps I'm biased :-)

 

My own Ph.D. thesis dealt with doing concurrent garbage collection in such a system.  The essential problem is that the underlying transaction system dealt with physical addresses of data, but the garbage collector might be moving objects, and thus changing addresses.  The collection may take much longer than the individual transactions in the system, which is why you want to do it concurrently, so you can't treat the collection like a transaction. I was in a race with an MIT grad student, Elliot (now Hillel) Kolodner, working on much the same problem, to finish.  While tense for a while, this eventually turned out fine for all concerned.  Our solutions differed enough not to cause problems.  Hillel, long since an eminent colleague in GC research, and I have agreed that while I got my thesis finished first, his was the better solution.  (I'd advise any Ph.D. student to take my half of that bargain, by the way!)

 

What lessons should the current round of transactional memory implementations draw from this previous work?  Well, we ought to be aware of the work, and let it inform us when it’s appropriate: for example, it would be probably be better if more people understood that there's a history to ideas like parallel or open nesting, that these ideas had relevant precursors in the literature.  On the other hand, having worked on both problems, I'm struck by how much the deletion of one letter from "ACID" changes the landscape.  Ratios of costs change, different issues come to the fore -- in many ways, it's a completely new world.  The engineering of the Argus/Avalon languages had concerns more akin to those of a database system -- how much data you write to stable disk storage, and when, being the overriding concern.  Committing a TM transaction is (or ought to be!) a much lighter-weight operation.  So intuitions formed in the previous world may not applicable in the TM world.  But still, it's nice to have some experience from which to start!

 

 

Welcome again to the TM blog! This time your host is Yossi Levanoni. I’m the dev lead on the TM project.

It was great to see the high-quality questions we’ve gotten in response to Dana’s previous (and our altogether first) post on the blog and I’m looking forward to many more interesting discussions. As Dana said, TM is not at a product development stage yet; it is currently an incubation and as such our dialogue with you—dear reader—is going to be less prescriptive (e.g., “this is how you use API XYZ”) and more a two-way conversation (e.g., “what kind of TM would be useful for you?” What value do you place on feature X?”)  We will also discuss the semantics and implementation techniques of the TM systems we’re incubating (but remember that we are not an open source shop!).

I would like to devote this post to a discussion on TM Programming Models. We would try to see what kind of questions are pertinent in such a discussion by examining a few choices (and alternatives) used by Tim Harris from MSR and his colleagues working on the Bartok STM. A little bit of background: Tim and colleagues have been researching the topic and have published some seminal papers on TM. Some of Tim’s research on TM has been conducted before joining Microsoft. Then, Tim and Simon Peyton-Jones and others have implemented TM for the Haskell programming language. Finally this work culminated in an implementation of TM for the Bartok CLR which is described in this paper. Our group (Parallel Computing Platform) continues to work closely with the MSR folks including Tim, Simon and Martin Abadi, an expert on memory models their formal definition and verification.

Putting the first thing first, we have to define our scope of investigation: What is TM? Is it just an implementation technique for achieving thread-safety? Is it a class of programming models that is independent of particular implementation strategies? Maybe it’s both? What about hardware transactions, are they part of the game? Lock elision? What about concurrency-safety annotations? These things are obviously so interrelated from an end-result perspective that we will discuss them all in this blog, in the same way that Microsoft considers them all “fair game” in order to improve the mechanisms developers have in their toolbox to manage shared state. There are going to be two unifying themes that will encompass most discussions though:

1.       Automating thread-safety concerns (what we want to achieve).

2.       Executing code speculatively or tentatively (what the technology does at its core).

 

While this post confines itself to the STM model of Bartok we will definitely consider the broader picture on this blog, both in terms of programming models, and in terms of implementation.

The Bartok STM Programming Model

Without further ado let’s start considering the Bartok STM programming model. The central idea of this excellent work is that you can write:

atomic { <statements> }

The <statements> within the atomic block would be totally isolated from the effects of other atomic blocks executing concurrently. Great, what does that actually mean and where are the gotchas? Here are a few:

User-Initiated Abort

In the database world, both the programmer and the system have a way to specify that the current work being done against the database (the transaction) should be cancelled and its effects rolled back. The user has the ability to instruct the system to abort the transaction and the system has the prerogative to abort the transaction from under the user. What are the equivalents of these options when automatic isolation for shared memory is considered? Let’s first consider user-initiated abort.

In the Bartok TM programming model the user does have the option to abort a transaction. This is done by letting an exception escape the boundary of the atomic block. As commentators to the previous post have astutely noticed, this feature has some radical implications, not just for concurrency but for single-threaded execution as well. This feature indeed changes the semantics of single-threaded execution in the sense that things will get rolled back (automagically) if the transaction is aborted. This is a very powerful feature that holds a promise to drastically improve the reliability of code under error conditions.

Of course, nothing is for free and providing user-initiated abort comes with a price, both in terms of performance and implementation complexity but also in terms of the complexity of the programming model (which is the focus of this post). Consider this piece of pseudo-code:

            atomic {

           FormatHardDrive ();

     }

 

FormatHardDrive cannot be rolled back once executed, but we have given the user the ability to abort the transaction, which we can’t retract. One option is to simply forbid operations like “FormatHardDrive” inside an atomic block; if the programmer tries to do something like that then the program will either fail to compile or raise an exception at runtime. In this particular case suppose that the implementation does allow any kind of operation inside an atomic block and the user wrote the code as:

 

      atomic {

            FormatHardDrive ();

            throw new Exception();

      }

 

Now we’re really trying to have our cake and eat it, too! FormatHardDrive cannot be rolled back, so presumably the user has the intent of not rolling back ever after executing this operation. On the other hand, the user also deterministically asked for the transaction to roll back. This contradiction between operations that signal to the system that the transaction shouldn’t roll back and explicit user-initiated abort could be solved only in so many ways, each leading to an interesting programming model in its own right. We will explore these options in future blog posts but as a quick tip of the hat I would mention the work that our Intel colleagues have been doing on irrevocable actions. (Digression: we have a little bit of an issue with the term “irrevocable transactions” because the outcome of a transaction, in a traditional transaction system, cannot be decided while the transaction is executing. Anyway, the concept is clear and useful: this thing, which is not a transaction in the usual sense, becomes irrevocable.)

 

System-Initiated Abort

As a somewhat related but separate issue we have the question of whether it is desirable to have a programming model where system “problems” (e.g., inability to commit due to contention) are exposed to the user. If we think of atomic blocks as a sort of lock replacement, then one analogy to the lock world would be the Monitor.TryEnter API. This API allows the user to take action if a lock cannot be acquired within a given deadline.

 

In the Bartok STM programming model there really isn’t a way to “try” executing a transaction---control flow resumes at the point following the atomic block only after the transaction has successfully completed. This completion event is always consistent, no matter whether the transaction has committed or aborted---both events are consistent, which means that all completed transactions in the system can be ordered serially with respect to each other. The research community uses the term serializability for this property.

 

“Purists” would typically object to API’s such as Monitor.TryEnter or TryAtomic on the grounds that it promotes user-level backoff and retry, which should be better left to the system. The “realists” would argue that sometimes there are other things to do, if the lock cannot be taken. The “purists” would argue back then that the application should express the availability of work using scheduling constraints rather than synchronization constraints. This argument also gives rise to thoughts about higher level constructs, such as a construct that lets the system choose between different alternative transactions. This way if the system experiences contention executing one alternative it can choose another alternative and try it out.

As with other considerations, we seek successful precedents in two domains to guide us. These domains are:

1.       The lock-based programming domain

2.       The database transactions domain.

We have discussed the former, what does the latter has to say on this? In the database world there is the assumption that since the system is distributed, something can always go awry and therefore the user must always be ready to deal with failures. This is a big burden for the application developer that is many times dealt with by simply turning around and presenting the error to the end user. One has to wonder whether the same considerations apply in shared memory programming, where we have the working assumption that storage (caches and RAM) and communication paths (data flow through the system) are much more reliable.

 

Different Forms of “abort” and “commit” Statements

In the Bartok model the only way to commit a transaction is by reaching the right curly brace of an atomic block. People often ask whether it makes sense to offer a “commit” statement that commits the work done so far and continue from that point in a separate transaction. The analogy to monitor based programming is the ability to briefly release and reacquire a lock somewhere (dynamically) inside a lock (<obj>) {} statement. This is obviously a dangerous pattern and a source of bugs, but it is also essential for coding coordination patterns with Monitor.Wait and Monitor.PulseAll (the reader is reminded that when Wait is called, the target monitor lock is released and then reacquired after the monitor is pulsed by another thread).

 

Similarly, when considering an explicit “abort”, in the Bartok model a transaction may be aborted only by letting an exception escape beyond the right curly brace of an atomic block. Again we could ask whether we should allow explicit statements to abort a transaction either in addition or instead of abort-by-exception. It is useful to consider the implications of decisions here to call-based composition of code. The Bartok model offers a remarkably strong proposition in terms of composability: a caller is always in control of the work a callee is doing and its inclusion in the outcome of the transaction:

1.       The caller can always discard the changes done by the callee by wrapping the callee in a nested transaction and aborting it if necessary.

2.       The caller is also in full control of exceptional situations. It can always catch exceptions from the callee and if it knows how to handle them, it can decide to still commit.

 

The following code snippet illustrates:

 

void Caller() {

    atomic {

        try {

            Callee();

        }

        catch (MyException e) {

            // Do something alternative here

        }

        If (IDontLikeTheStateOfTheWorldNow()) {

            throw new VetoException()

        }

    }

}

 

The caller sometimes accepts and sometimes rejects success of the callee. The caller sometimes accepts and sometimes rejects failure of the callee. The caller is in control. This allows a great deal of call-based composition.

 

Coordination Constructs.

The atomic block as presented thus far provides isolation and atomicity but it doesn’t offer much help with coordination of concurrent activities. i.e., how would you implement a blocking queue using transactions? Harris et al promote the retry statement as the central way to coordinate threads involved in transactions. The semantics are dead-simple: if you hit a retry statement then the transaction rolls back and re-executes. e.g., consider this example:

 

public class SingleCellQueue<T> : where T : class {

      T m_item;

 

      public void T Get() {

            atomic {

                  T temp = m_item;

                  if (temp == null) retry;

                  m_item = null;

                  return temp;

            }

      }

 

      public void T Put(T item) {

            atomic {

                  if (temp != null) retry;

                  m_item = item;

            }

      }

}

 

Suppose a thread calls Get() and m_item is null. In this case the retry statement would be encountered, the transaction will roll-back and re-execute. At this point I can imagine visible unease in the crowd: is it just going to spin and re-execute, eating up my CPU and killing my battery? Well, hopefully not---an implementation can be infinitely dumb or infinitely smart about how to implement retry. But still, let’s recognize that the developer doesn’t tell the system when would be a good time to pulse waiters, unlike condition variables and monitors. Harris et al also explain in their paper on composable transactions how retry and orElse can be combined together to achieve composability of coordination, which is not possible with explicit condition variables. So on one hand this is a simplification for the developer, but on the other hand it’s costly it for the system to implement.

 

Another property of retry is that it rolls back anything done so far in the transaction, so it is not possible to externalize to the outside world why you’re waiting and what it is that you wish be done by parallel activities, which is a pattern common with condition variables.

 

What are some other approaches to coordination? One possibility is what I would term “abstinence”. According to a majority of people I have sampled blocking is to be discouraged and replaced with various dependency/constrained scheduling schemes for tasks (e.g. the .NET asynchronous programming model). The core question is whether you own the thread or not. If you don’t own the thread, then blocking should be avoided. If you do own the thread, then you need to ask yourself “why am I owning the thread instead of using cooperative task scheduling?” In other words, cooperative task scheduling, with the ability to describe and enforce task dependencies is the “wave of the future”. So maybe this means that providing a blocking construct for atomic blocks is a second-level concern and maybe as such, a solution which is least impactful for the system should be preferred, at least initially.

 

Another option is to define transactional counterparts to condition variables that you explicitly wait and pulse. Condition variables have the semantics of “committing” when you call Wait and thus they bring the question of unconditional commit to the forefront again.

 

Conclusion

In this post we have started skimming the surface of possible programming models for a managed language TM system. There are many other considerations that we will cover in the future. My goal was to provide a brief “insider” look at the factors that we are looking into as a preparation to more concrete questions about the specific choices we are making. Please feel free to follow-up with questions or comments on any of the above topics or others.

If you have been using the Parallel Extension CTP or simply writing multi-threaded code yourself, you probably have run into situations where you needed to share data between multiple threads.   So long as the data is read-only, this isn’t a problem, but what about mutating data?

The easy answer is to use a lock.  There are a lot of blog entries and white papers talking about how to use locks correctly, how to avoid deadlocks, or what are the best locks for the particular scenario, or even how to correctly write lock-free code. You could read all of these and still run into trouble using locks.  You see, the problem isn’t sharing one piece of data; it’s when you are sharing multiple pieces of data – for instance data that has a complex schema involving multiple complex objects such as trees or lists. 

So, locks are a basic tool in your arsenal.  From the simple lock, you can build synchronization mechanisms that hopefully protect your data, correctly, and don’t impact your scalability.

Ah, I hear you sigh.  Yes, hope is eternal, software has bugs and multithreaded software has race conditions, deadlocks, and scalability problems.  Why?  Well, because many find it is hard and fraught with peril to correctly use anything more than a single lock or at most some really small set of course-grained locks.  As code matures, locking hierarchies to provide fine-grained-locking often morph from elegant to clumsy.  You may also find that as your project grows, lock depth blossoms, unnecessarily, or alternatively race conditions are introduced simply because programmers were unaware that it necessary to lock a specific resource.  The end result is code that simply doesn’t scale or your application’s reliability plummets without some of your best and brightest spending time tuning, fixing, “right-sizing” and eliminating locks.  Even after all that work, are you confident that your code is bug-free?  Do race conditions exist in it?  How can you even inspect the code and reason about its correctness?

So, I suspect you are living with some rather course-grained locks or with fear.

Many of my coworkers argue quite persuasively that you should get rid of locks by decomposing the data such that there is no sharing.  They point to the use of “agents” and “messaging” as a good programming paradigm to ensure that data accessed by a specific thread is logically isolated from other threads.  This is a great method but can you re-architect your code to do this?  Can you use it over multiple data structures?  Can you compose them together?

The jury is still out, but I suspect that not all programming patterns in traditional object-oriented languages can be decomposed to a point where there is no sharing.  I find that advocates of the agent model still have concurrency problems!  They have shared data within their agents and they are hampered by either requiring one thread per agent or to use locks or interlocked operations to manage this shared state.

What would you say if I told you that you could reason about your code in a sequential fashion and be relieved of worrying about other code interacting with its data?  Oh, let’s say it’s in a way that scales?   Would that help you?

Well, that is what transactional memory (TM) promises to provide.

Transactional memory is not about “removing locks” but is about abstracting away the requirement to specify a particular lock.  Instead, you can structure your code in well defined sequential blocks of code, what in the database world we call “units of work”, and then let the underlying runtime system, compiler, or hardware provide you the guarantees you desire.  Further, you want this work to scale.  To do that, the underlying system provides concurrency control optimistically.  Instead of always locking a resource, the transactional memory system assumes that there is no contention.  Instead, it detects when these assumptions are incorrect and rolls back changes that were made in the block.  Depending on the implementation, the transactional memory system may then re-execute your block of code.

In this way, transactional memory provides an execution pattern of multithreaded code that can be reasoned about sequentially and when there is very little contention, scales well.  What about when there is contention?  Well, “it depends” will likely be the answer.  How big the transaction is and how often it re-executes (due to contention) changes the performance curves significantly.  Transactional Memory systems usually include some code to manage contention.  That contention manager will significantly impact how code scales under contention.  We would hope that the resulting performance is not worse than fully serialized execution.

So what does this look like?  Various experimental TM systems have been produced.  Some integrate with the compiler, others, like SXM, provide library support.  When I talk about TM and show examples I prefer to use a simple syntax which delineates a block of code as “atomic”.  For instance:

                Atomic {

                                myBigCollection.insert(someStuff);

                                if (condition)

                                                otherCollection.mutate (myBigCollection);

}

 

So, in this example I have two collections and some operation that may optionally work on one depending on the other.  Obviously, my code is demonstrating composability over these two collections.  What might not be so obvious is that I didn’t have to catch any errors here – my code completed successfully or an exception was thrown and nothing was done.  How would I do that with locks now?

 

                Lock (myCollectionLock) {

                                Bool a = false;

                                Bool b = false;

                                try {

                                myBigCollection.insert(someStuff);

                                a = true;

 

                                if (condition) {

                                                otherCollection.mutate (myBigCollection);

                                                b = true;

                                }

                Catch (exception e) {

                                If (a) myBigCollection.remove(someStuff);

                                If (b) myStaticStuff.unmutate(otherCollection, myBigCollection);

// I probably had to write my own unmutate

Throw e;

                }

} // lock

 

Which one do you want to maintain or test?  Oh, did you catch that I had to write my own ‘unmutate’?  Ugh, I hope that is possible.  Also, did you note that I had to keep the lock through the catch?  If I didn’t, there would be a race condition when a failure occurred and the work undone.  Heck, just setting the flags is error prone.  Consider what would have happened if I set “b = true” but forgot to correctly enclose the “if (condition)” clause.

Could your test team find this bug?

                                if (condition)

                                                otherCollection.mutate (myBigCollection);

                                                b = true;

 

This blog is about transactional memory with entries written by the team at Microsoft that is implementing an experimental transactional memory system in .NET.  While we work in the Developer Division’s Parallel Computing Platform product group, we are not working on a product release at this time.  Instead we have been incubating, researching, and experimenting with transactional memory. 

Our goal is to create a TM system that lives up to its promise.  This work has lead us to what we think is some industry-leading work – but we have been so busy doing it that we haven’t had a chance to talk about it; share what we have found; and more importantly hear what the community thinks of this effort. 

This blog is our first step to remedy that.  Welcome to the Transactional Memory blog here at MSDN.

 
Page view tracker