(by Sukhdeep Sodhi)
Now that we’ve released STM.NET into the wild, some of you may want to go beyond the simple ‘Hello World’ example* and start writing more sophisticated applications.
There are a lot of excellent research ideas on transactional memory. However, as Sasha likes to say, “building a useful product requires great theory, a good implementation, and something more.” His blog post details some of the challenges that we have had to overcome in building a usable Transactional Memory solution.
In addition to the challenges that Sasha mentions (debugger support, integration with traditional transactions), we also thought long and hard about creating a programming model for STM.NET that would be useful in the real world. An important question that drove our thinking in this space was: how can a programmer writing a non-trivial STM application know which code can be safely invoked inside a transaction? Today’s blog post focuses on why this question is important and how we addressed it (TM programming models are a vast topic, so if you want a broader understanding of this area, read Yossi’s post on the subject).
Atomic blocks provide isolation and failure atomicity. To provide these properties, the runtime instruments the code running inside atomic blocks so that any changes made inside a transaction can be rolled back in case of a conflict or failure. In the CLR version of STM.NET, the JIT can instrument verifiable managed code in a straightforward manner. However, some unsafe code patterns, p/invoke, and other forms of native interop are not “visible” to the JIT, so atomic behavior cannot be inserted automatically in those cases. As a result, there is always going to be code that cannot be correctly called inside a transaction. Conversely, a developer may want some data to always be accessed in a thread-safe manner, i.e. only inside atomic blocks.
So the question before us was “how should we codify the behavior that some code can be accessed only inside a transaction and some code can be accessed only outside of transactions?” Our solution: to build a set of contracts for STM. Placing a contract on a method would indicate, for example, whether that method can be invoked inside a transaction. The three primary contracts we defined are:
· AtomicSupported. Indicates that the given method may be invoked correctly both inside and outside of atomic blocks.
· AtomicNotSupported. Indicates that the given method can be invoked correctly only outside of atomic blocks.
· AtomicRequired. Indicates that the given method can be invoked correctly only within atomic blocks.
In this post I will primarily talk about how STM’s contract system applies to methods. However, these contracts can also be applied on fields, accessors, indexers, and delegate types. In addition there are several special cases and interesting implementation challenges that we faced. We may go into those details in later posts.
If you think about these contracts a little, you realize that they are very similar to access modifiers in languages such as C# and C++. Typically access modifiers are keywords in the language. However, since we did not implement compiler support for STM.NET, you will not see these contracts as C# keywords. Instead we have implemented them using .NET attributes. So if you want to say that Foo can be called only inside a transaction, Bar can be called only outside a transaction, and FooBar can be invoked anywhere, you annotate each method with the corresponding attribute: AtomicRequired on Foo, AtomicNotSupported on Bar, and AtomicSupported on FooBar. Invoking a method in a manner prohibited by its contract, for example calling Bar inside an atomic block or Foo outside of one, is an illegal invocation.
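The annotations described above might look roughly like the following sketch. So that it compiles on a stock C# compiler, it declares its own stand-in attribute classes and a stand-in `Atomic.Do` helper that simply runs the delegate; these stand-ins are assumptions for illustration, provide no isolation or rollback, and are not the definitions that ship with STM.NET. The illegal invocations are left as comments, since nothing in this sketch would flag them.

```csharp
using System;

// Stand-in attributes so the sketch compiles without STM.NET installed.
// With STM.NET you would use the attributes that ship with it instead.
[AttributeUsage(AttributeTargets.Method)]
sealed class AtomicRequiredAttribute : Attribute { }

[AttributeUsage(AttributeTargets.Method)]
sealed class AtomicNotSupportedAttribute : Attribute { }

[AttributeUsage(AttributeTargets.Method)]
sealed class AtomicSupportedAttribute : Attribute { }

// Stand-in for an atomic block: it only runs the body, with no
// isolation and no rollback.
static class Atomic
{
    public static void Do(Action body) { body(); }
}

static class Program
{
    static int counter;

    [AtomicRequired]       // may be called only inside an atomic block
    public static void Foo() { counter++; }

    [AtomicNotSupported]   // may be called only outside an atomic block
    public static void Bar() { Console.WriteLine("counter = " + counter); }

    [AtomicSupported]      // may be called in either context
    public static int FooBar() { return counter; }

    static void Main()
    {
        Atomic.Do(() =>
        {
            Foo();         // OK: AtomicRequired inside a transaction
            FooBar();      // OK: AtomicSupported
            // Bar();      // ILLEGAL: AtomicNotSupported inside a transaction
        });

        Bar();             // OK: AtomicNotSupported outside a transaction
        FooBar();          // OK: AtomicSupported
        // Foo();          // ILLEGAL: AtomicRequired outside a transaction
    }
}
```

The point of the sketch is only where each call is legal: the contract travels with the method declaration, so a reader (or a checking tool) can judge each call site without looking at the method body.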
As you can imagine, these contracts make it very easy for a library developer who expects the library to be used from STM.NET programs to specify which functionality can be safely accessed inside a transaction and which cannot. All that needs to be done is to annotate the public APIs of the library with these contracts. To make the library developer’s job easier, we allow specifying a default contract for an assembly. The default contract applies to (almost) all the methods in an assembly. This way the library developer needs to annotate only the methods that deviate from the default contract, instead of annotating each and every public method in the assembly.
Another advantage of using these contracts is that they make it easy to identify code that invokes a method or accesses a field in an incompatible transactional context. We’ve built a runtime checker that is a part of the CLR execution engine and can be used to identify contract violations, such as AtomicNotSupported code that is accessed inside a transaction and AtomicRequired code that is accessed outside a transaction. You can find more details on the runtime checker in section 6.7 of the STM Programming Guide. We’ve also built a static checking tool to catch these errors which, taking a cue from the famous FxCop tool, is called TxCop.
The static checking tool can help you catch (most) contract violations before running your STM-enabled application. If you have downloaded the samples from our download site, you’ll notice that the static checker executes as a post-build step in Visual Studio. This was done to simulate the experience of catching these errors as part of the compilation process.
TxCop, like all static checkers that do not have the luxury of doing whole-program analysis, has the limitation that it cannot always tell which method will be invoked at a particular call site. For example, due to language features such as polymorphism, a static checker cannot tell whether the method that will be invoked comes from the base class or from a derived class. Similarly, at a call site where the method invocation happens through a delegate reference, there is no way to know statically which actual method will be invoked. To work around this limitation we have added rules that govern polymorphism and delegate construction. The rules for polymorphism are based on the principle that a derived method should always be able to honor the contract on the base method. So a contract on a virtual method constrains the contracts that can be placed on the methods overriding it. For example, a method overriding an AtomicSupported virtual method cannot be marked AtomicNotSupported, because a caller holding a base-class reference may legitimately invoke it inside a transaction.
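One way to read the polymorphism principle is as a small compatibility check: an override honors the base contract if it is legal in every context (inside or outside a transaction) in which the base method is legal. The sketch below encodes that reading. The `AtomicContract` enum and the `OverrideRule` class are hypothetical names introduced here for illustration, and the table is derived from the principle as stated in this post; the actual TxCop rules may be stricter or handle special cases differently.

```csharp
using System;

// Hypothetical model of the three contracts; not an STM.NET type.
enum AtomicContract { Supported, NotSupported, Required }

static class OverrideRule
{
    // Contexts in which a method with the given contract may legally run.
    static bool AllowedInside(AtomicContract c)  { return c != AtomicContract.NotSupported; }
    static bool AllowedOutside(AtomicContract c) { return c != AtomicContract.Required; }

    // An override honors the base contract if it is legal wherever
    // the base method is legal.
    public static bool CanOverride(AtomicContract baseContract, AtomicContract overrideContract)
    {
        bool insideOk  = !AllowedInside(baseContract)  || AllowedInside(overrideContract);
        bool outsideOk = !AllowedOutside(baseContract) || AllowedOutside(overrideContract);
        return insideOk && outsideOk;
    }

    static void Main()
    {
        // Print the full base/override compatibility table.
        foreach (AtomicContract b in Enum.GetValues(typeof(AtomicContract)))
            foreach (AtomicContract o in Enum.GetValues(typeof(AtomicContract)))
                Console.WriteLine("base " + b + ", override " + o + ": " + CanOverride(b, o));
    }
}
```

Under this reading, an AtomicSupported base method admits only AtomicSupported overrides, while an AtomicRequired or AtomicNotSupported base method admits an override with the same contract or with AtomicSupported.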
When it comes to delegates, we know that a delegate encapsulates a reference to a method with a particular set of arguments and return type. This set of arguments and return type is captured in the delegate type’s declaration. If the arguments or return type in the delegate-type declaration do not match the method that the delegate is trying to encapsulate, the compiler generates an error. For STM we have added the capability to place contracts on delegate types and, similar to the polymorphism rule, we’ve added the constraint that the contract on a delegate’s type should always allow it to honor the contract on the method whose reference it encapsulates. A delegate whose type’s contract is AtomicSupported can be invoked inside transactions as well as outside of transactions, so it can only encapsulate a reference to a method that is itself safe in both contexts, that is, a method with the AtomicSupported contract. A delegate whose type’s contract is AtomicRequired, on the other hand, can only be invoked inside a transaction, so it may encapsulate references to methods that are safe to call inside a transaction, such as methods with the AtomicRequired contract.
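A contract on a delegate type might be written as in the following sketch, which sticks to the simplest case where the delegate type and the method it wraps carry the same contract. As before, the attribute classes and the `Atomic.Do` helper are stand-ins declared here so the code compiles on a stock C# compiler; they are assumptions for illustration, enforce nothing, and are not the STM.NET definitions. The names `Reader`, `Updater`, `ReadBalance`, and `Deposit` are hypothetical.

```csharp
using System;

// Stand-in attributes that may be placed on methods and delegate types.
[AttributeUsage(AttributeTargets.Method | AttributeTargets.Delegate)]
sealed class AtomicRequiredAttribute : Attribute { }

[AttributeUsage(AttributeTargets.Method | AttributeTargets.Delegate)]
sealed class AtomicSupportedAttribute : Attribute { }

// Stand-in for an atomic block: runs the body with no isolation or rollback.
static class Atomic
{
    public static void Do(Action body) { body(); }
}

[AtomicSupported] delegate int Reader();    // may be invoked in either context
[AtomicRequired]  delegate void Updater();  // may be invoked only inside a transaction

static class Program
{
    static int balance;

    [AtomicSupported] public static int ReadBalance() { return balance; }
    [AtomicRequired]  public static void Deposit() { balance += 10; }

    static void Main()
    {
        Reader read = ReadBalance;     // delegate and method are both AtomicSupported
        Updater update = Deposit;      // delegate and method are both AtomicRequired

        Console.WriteLine(read());     // OK: AtomicSupported delegate outside a transaction

        Atomic.Do(() =>
        {
            update();                  // OK: AtomicRequired delegate inside a transaction
            read();                    // OK: AtomicSupported delegate inside a transaction
        });

        // update();                   // ILLEGAL: AtomicRequired delegate outside a transaction
        Console.WriteLine(read());
    }
}
```

Because the contract sits on the delegate type, a call site that only sees an `Updater` value still tells a reader and a static checker that the call belongs inside an atomic block, without knowing which method the delegate wraps.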
I hope that this blog post gave you a good sense of the STM.NET contract system and how it can make your life easier when you write STM-aware applications. If you’d like to find out more about STM contracts, static checking rules, or our runtime checker, please look at section 7 of the STM Programming Guide. And feel free to reach out to us if you have questions, or just to let us know if you think we did something right or wrong.
*Actually, the ‘Hello World’ example is not so straightforward to write in STM.NET. If you’d like to take a stab at writing it, take a look at Section 10.1 in the STM Programming Guide.