Going to PDC this year? We are too!
Niklas Gustafsson, the Principal Architect on Axum (and some other jewels like the Parallel Extensions to the .NET Framework and the Concurrency Runtime) will be giving a talk on Axum in the Innovation Track (here’s a link to the session’s abstract)! The date hasn’t been set yet, but keep your eyes peeled for updates.
Also, while you’re there, make sure to check out some of the other parallel computing sessions:
Leave us a comment and let us know if you’ll be attending this year. :)
Josh Phillips
Program Manager and Axumite
Jebu Ittiachen and Arun Suresh of Yahoo! have written an interesting article for Dr. Dobbs about a prototype they’ve built for Yahoo! Harvester using Erlang. The performance gains they achieved are quite impressive.
Some of the more interesting highlights/discussion points I’ve pulled from the article:
- The language itself is only part of the draw for Erlang users. The OTP framework provides great value to those writing distributed applications. “Since the restart functionality is already found in the standard library, no extra code was required to be written.” What parts of the OTP do you feel are essential? What’s missing?
- It reiterates some of Erlang’s flaws (we’ve heard lots of complaints about syntax). I read this as a major opportunity for us to provide a highly differentiated solution and increase developer productivity (which is what Visual Studio is all about!). “Erlang is not the perfect platform. It has its share of weaknesses and most of it comes from the design goals behind the language.” What else besides syntax and the lack of true string data types gets in your way?
- Performance comparisons. WOW. The performance gains on quad-cores are simply incredible. 100% CPU efficiency and a 3750% increase in MAX capacity per machine. These numbers show major cost savings for our customers. Of course, it all depends on the baseline you are comparing against, but I think it’s a safe assumption that Yahoo!’s developers are capable of writing fairly scalable, fairly performant code.
- It’s interesting to note that they’re using Erlang “successfully” in 5 properties. We’re seeing a lot more commercial products written entirely in Erlang or with components written in Erlang. Some other noteworthy examples include Amazon, Facebook Chat, and Microsoft’s own PowerSet. This trend suggests that companies are willing to overcome the learning curve and employ once-obscure technologies to achieve performance gains and, ultimately, cost savings in both hardware scale-down and developer productivity.
Josh
Hello Axumites!
It’s been a while since we last talked, but many of you have silently been downloading the bits and surely some of you have actually been using them. As we get further along in our prototyping/planning efforts, we want to know how you’ve put Axum to work and what it’s done for you.
What have you built? We’re even interested in hearing about your toy apps.
Did Axum benefit you? If so, how?
Tell us what the single most painful part of your experience was.
Did you have an epiphany, a point where the agents model just made sense?
If you haven’t built anything, what would you want to build? If you have, what do you plan to build next?
Keep experimenting! We have a series of thought-provoking blog posts coming soon.
Josh
Program Manager, Axum
Update: The survey is now closed. Thanks to all who participated!
Axumites,
We’ve heard our customers’ frustrations with asynchronous programming and their call for improved support. We are hoping to better understand why and how you and your customers use asynchronous programming in .NET and how the support we provide for it can be improved in the future.
We invite you to complete a short survey on the Asynchronous Programming Model (APM) as well as on a theoretical language construct called “Asynchronous Methods.” Completing this survey shouldn’t take more than 5-10 minutes of your time, and doing so will help us to better understand where the APM is lacking, how we can make asynchronous programming better and, ultimately, how we can increase your productivity and the scalability and reliability of your code. We encourage you to forward this message on to others who may also have had experience with asynchronous programming.
http://deploy.ztelligence.com/start/index.jsp?PIN=15TMLTB8M3X9C
Thank you!
Josh Phillips | Program Manager II
Are you excited?
We thought so. We’ve been listening to some of your pain points with Axum 0.1.0 and went straight to the kitchen to cook up 0.2.0 - now available for your code-slingin’ pleasure. We’ve fixed quite a few bugs and made some overall improvements. For example, we’ve:
- Added an installer for Visual Studio 2010 Beta1
- Enabled parallel execution of functional nodes in dataflow networks
- Made it possible to change fonts and colors of Axum language elements via Tools | Options | Fonts and Colors
- Moved samples to a zip file to make using them easier
- Introduced AxumLite.zip – an Axum command line compiler that doesn't require Visual Studio
- Fixed the compiler error where the channel name was the same as the enclosing namespace name
- Made handling of immutable primitive types more rigorous; fixed some side-effect related bugs
- Added 'using System.Concurrency.Messaging' to the VS-generated template to make classes like OrderedInteractionPoint visible by default
- Added the async method Microsoft.Axum.IO.Console.ReadLine
- Added a spiffy Auction sample (A big shout out to Matthew Podwysocki for his help!)
Remember, Axum’s success and ability to improve your developer toolkit is very much dependent on your feedback. Please try it, abuse it, and be very vocal about what you think on the forum. We love to hear both good and bad things!
Axum 0.2.0 for Visual Studio 2008
Axum 0.2.0 for Visual Studio 2010
Axum Lite
Josh Phillips and the Axum Team
It’s been two weeks since we released the CTP on DevLabs. Thousands of people have downloaded it, tried it out, and many have sent us their feedback – thanks for all your input! In the coming weeks we will be releasing a new build that will contain some bug fixes and a few suggested improvements.
One of the features that gets a lot of attention is asynchronous methods. This is what allows us to have thousands and thousands of concurrently running agents without a significant memory footprint. This is what allows us to build responsive IO-heavy applications. It is also a good way to explain how cooperative multitasking works in Axum.
Let’s start with a sample application that I was writing the other day for some other blog post:
agent MainAgent : channel Microsoft.Axum.Application
{
    public MainAgent()
    {
        var pt = new OrderedInteractionPoint<int>();
        // Set up a dataflow network
        pt ==> MultiplyByTwo ==> Print;
        // Send some numbers to the network
        for(int i=0; i<5; i++) pt <-- i;
        PrimaryChannel::Done <-- Signal.Value;
    }
    int MultiplyByTwo(int n)
    {
        return n*2;
    }
    void Print(int n)
    {
        Console.WriteLine(n);
    }
}
Before you copy it to Visual Studio, compile and run it – quick, what should be the output of the program?
I was trying to demonstrate that the elements of the dataflow network come out of it in the same order as they were put in. The program was supposed to send five numbers through the pipeline and get back the resulting numbers in the right order.
You might be surprised to discover that the program finishes without printing any numbers. Actually, it’s not hard to see why – the agent sends a message to the Done port before the dataflow network gets around to processing the numbers. As soon as the runtime receives a message on the Done port, it terminates the application – sayonara!
If you’re like me, you might be tempted to try to “pause” the agent by putting a call to Console.ReadLine before sending a message to the port Done:
// Send some numbers to the network
for(int i=0; i<5; i++) pt <-- i;
// Not so fast! Wait until the user hits
// return before continuing
Console.WriteLine("Press Enter to continue...");
Console.ReadLine();
PrimaryChannel::Done <-- Signal.Value;
Maybe you already see the problem but I didn’t. When I compiled and ran the program, it dutifully asked me to press Enter, which I did, after which it again terminated without printing any numbers!
That’s where it gets interesting.
Because MultiplyByTwo and Print are methods defined in the agent, they cannot run in parallel with each other or with the agent constructor. If they could, they would be able to modify an agent field, resulting in a data race.
For the two methods to execute, the constructor needs to explicitly suspend its execution and give other methods a chance to execute. In other words, it has to cooperatively yield.
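The same effect is easy to reproduce outside Axum. Here is a minimal sketch in Python's asyncio, which is also cooperatively scheduled. It is an analogy only, not Axum code: run, network and the queue are hypothetical stand-ins for the agent constructor, the MultiplyByTwo ==> Print network and the interaction point.

```python
import asyncio


async def run(yield_before_done):
    """Hypothetical stand-in for the agent constructor."""
    printed = []
    inbox = asyncio.Queue()

    async def network():
        # Plays the role of MultiplyByTwo ==> Print.
        while True:
            n = await inbox.get()
            printed.append(n * 2)

    worker = asyncio.create_task(network())

    for i in range(5):
        inbox.put_nowait(i)        # like "pt <-- i": queues work, never yields

    if yield_before_done:
        await asyncio.sleep(0)     # cooperative yield: the network gets a turn

    # Like sending to Done: shut everything down.
    worker.cancel()
    try:
        await worker
    except asyncio.CancelledError:
        pass
    return printed


without_yield = asyncio.run(run(yield_before_done=False))
with_yield = asyncio.run(run(yield_before_done=True))
```

Without the yield the worker never gets a turn before shutdown, so nothing is processed. With a single yield it drains the queue in order.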
In Axum, all methods have to acquire certain permissions before they can start executing concurrently. A method that can mutate a domain’s state needs that domain’s writer token; a method defined in an agent needs that agent’s writer token. A method defined in an agent that is defined in a domain needs both tokens. Read-only methods (called true functions in Axum) require reader tokens for the corresponding domain and/or agent. Only one writer token exists per agent or domain, while multiple reader tokens are available.
This might seem complex but the implementation is actually quite straightforward and cheap. The tokens are stored in thread-local storage (TLS) and don’t need to be passed from method to method, and there aren’t that many situations where the tokens need to change ownership.
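To make the token rules concrete, here is a rough model in Python. It is not how the Axum runtime is implemented. Tokens and try_start_agent_method are hypothetical names, and the sketch assumes the usual reader/writer convention that outstanding reader tokens block the writer token and vice versa.

```python
class Tokens:
    """Hypothetical token pool for one agent or one domain."""

    def __init__(self):
        self.readers_out = 0      # any number of reader tokens may be out
        self.writer_out = False   # but there is only one writer token

    def try_take_reader(self):
        if self.writer_out:       # assumption: a writer excludes readers
            return False
        self.readers_out += 1
        return True

    def try_take_writer(self):
        if self.writer_out or self.readers_out:
            return False
        self.writer_out = True
        return True

    def give_back_reader(self):
        self.readers_out -= 1

    def give_back_writer(self):
        self.writer_out = False


def try_start_agent_method(domain, agent):
    """A writer method of an agent inside a domain needs both writer tokens."""
    if not domain.try_take_writer():
        return False
    if not agent.try_take_writer():
        domain.give_back_writer()   # all or nothing
        return False
    return True
```

In this model, two read-only methods can hold reader tokens of the same domain at once, while a mutating agent method has to wait until both the domain's and the agent's writer tokens are free.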
One such situation is calling an asynchronous method, such as receive. When the receive is called and the message isn’t immediately available, the method pauses and releases its tokens to whoever is the next in line to acquire them.
For an asynchronous method, that pause is implemented as a return to the caller, and all the way to the thread pool if all the methods in the call stack are asynchronous. The rest of the method is then transformed by the compiler to run as a continuation.
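If you have not seen this kind of transformation before, Python generators give a fair intuition for it. In this sketch (an analogy, not what the Axum compiler emits), each yield is a pause that returns to the caller, and the generator object keeps the rest of the method around as the continuation. The receiver, sender and run_all names are made up for the example.

```python
from collections import deque


def receiver(mailbox, log):
    log.append("receiver: waiting")
    while not mailbox:
        yield                      # pause: return to the scheduler
    log.append(f"receiver: got {mailbox.popleft()}")


def sender(mailbox, log):
    yield                          # let the receiver run first
    mailbox.append(42)
    log.append("sender: sent 42")


def run_all(tasks):
    """Tiny round-robin scheduler standing in for the thread pool."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)             # resume the continuation
            ready.append(task)     # it paused again; queue it for later
        except StopIteration:
            pass                   # the method ran to completion


mailbox, log = deque(), []
run_all([receiver(mailbox, log), sender(mailbox, log)])
```

The receiver pauses twice before the message arrives and then picks up exactly where it left off, all on a single thread.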
By the way, I think that the term “asynchronous” to describe such methods is somewhat unfortunate – perhaps a better term would be “pausable” or “restartable” – because the ability to pause and restart is what these methods are all about. As if this were not confusing enough, we also picked different keywords for the experimental C# – async – and Axum – asynchronous. (We could not use the async keyword in Axum because it was already used for something else.)
Both the experimental C# and the Axum compiler recognize the methods defined with the async/asynchronous modifier, as well as the APM methods, and treat them similarly.
Speaking of the APM methods – if you’ve used the .NET StreamReader class before, you must have noticed that it does not have the APM counterparts of the synchronous methods such as Read, ReadLine etc. Reading a text file line by line is not trivial – you have to read blocks of bytes, finding newlines among them and maintaining the current character position in the buffer so that the next invocation could pick up where the last one left off. Try to do it asynchronously and the complexity quickly becomes unmanageable.
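To show the bookkeeping this involves, here is a small Python sketch of just the line-assembly part: blocks of bytes arrive in arbitrary chunks and the partial last line has to be carried over to the next call. LineAssembler is a hypothetical helper, not part of any library, and it assumes UTF-8 text with "\n" line endings.

```python
class LineAssembler:
    """Turns arbitrary blocks of bytes into complete lines."""

    def __init__(self):
        self._pending = b""        # bytes of the line still in progress

    def feed(self, block):
        """Accept one block; return the lines it completed."""
        data = self._pending + block
        # Everything before the last newline is complete; the tail is not.
        *complete, self._pending = data.split(b"\n")
        return [line.decode("utf-8") for line in complete]

    def finish(self):
        """At end of stream, return the final unterminated line, if any."""
        rest, self._pending = self._pending, b""
        return [rest.decode("utf-8")] if rest else []


assembler = LineAssembler()
first = assembler.feed(b"ab\ncd")      # "cd" is held back
second = assembler.feed(b"e\n\nf")     # completes "cde" and an empty line
last = assembler.finish()
```

Even this synchronous version has state to carry between calls. Doing the same across Begin/End callbacks means threading that state through every continuation by hand.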
The problem is solved trivially using the asynchronous methods – all you need to do is take the implementation of the TextReader, StreamReader, TextWriter and StreamWriter and sprinkle the methods with the async modifier. That’s all it took for us to make the synchronous methods asynchronous! These classes are now available in the Microsoft.Axum.IO namespace.
This is how the asynchronous ReadLine can be implemented using the classes from the Microsoft.Axum.IO namespace:
using Microsoft.Axum.IO;
public isolated static class AsyncConsole
{
    public async static string ReadLine()
    {
        var reader =
            new StreamReader(
                System.Console.OpenStandardInput());
        return reader.ReadLine();
    }
}
Now the call site looks like this:
// Send some numbers to the network
for(int i=0; i<5; i++) pt <-- i;
// Not so fast! Wait until the user hits
// return before continuing
Console.WriteLine("Press Enter to continue...");
AsyncConsole.ReadLine();
PrimaryChannel::Done <-- Signal.Value;
Finally, this works correctly and prints out the numbers 0, 2, 4, 6, and 8, as expected.
The next build of the Axum CTP will contain the new Console class – look for it in the Microsoft.Axum.IO namespace.
Artur Laksberg
This article is in response to some great commentary from someone writing under the screen name “sylvan” over on the Channel 9 site:
This is pretty cool, but I think the semantics are overly complicated. I couldn't say that I know of a better way of doing it off hand, but I feel that there *must* be some way of making this simpler. As it stands writing agents still seems to be quite painful and clumsy, and something you would avoid doing up front, and instead do as an afterthought once you realise you need it. I think it's critical that writing agents should be as "light weight" as possible so that people write *all* their code using agents not because they necessarily believe they need them, but because they're the most convenient way of getting stuff done even when running on a single-threaded machine.
For example, there seems to be two main ways of interacting with an agent, either by just passing messages and reading from the channels, or by using request-reply ports if you want to be able to send off multiple requests and then get the reponse back while keeping track of which response belongs to which request. It seems to me that this duplication is unecessary. If you want to send multiple requests couldn't you just be required to use multiple agents, one for each "transaction" (associating a result with a given request is then trivial)? If they need to share state you could use a domain, right? I've only briefly looked at it but it does seem that the request-reply ports just complicate things and aren't actually necessary.
Also, I think first-class tuples will be very important for this, as you tend to want to make quick ad-hoc groupings of data all the time when sending and receiving messages.
The semantics and syntax of this needs to be simplified a lot to make it easier to use, it still seems that you spend far too much time and screen real-estate dealing with the details of coordination, rather than your algorithm.
There are several really important things to talk about in here, and I’ll try to get to them all.
First, let me address the easy one: tuples. Yes, you’re right, and I wish we had just gone ahead and added them from the start. F# has them, and so should Axum. Same thing with a unit type – we have added ‘Signal’ as a poor man’s unit type (no literal support), but we haven’t made it interoperate with ‘void.’ This is absolutely something we would like to fix.
On to the deeper and more subjective issues! What sylvan touches on are some fundamental design choices we made for the language, so let me elaborate on it and then you can all chime in on whether they were good choices or not.
Agents vs. Request / Reply Ports
Sylvan is absolutely right that one could create a new agent instance for correlation purposes. There are two main reasons for not always doing so and relying on request/reply ports for correlated replies:
First, you may actually want to fit the use of the port(s) into the overall protocol of the channel. This is only relevant if you have added states to the channel. There is no way to express protocols across channels (doing so may seem desirable until you consider how complex they would be to reason about), so if the use of correlated requests and replies needs to be incorporated into a protocol, this is the way to do it.
Second, there’s the issue of performance. While I would like it if agents were as cheap to use as classes, it is not the case, and we are not trying to hide it in the language. One of our core design principles is not to pretend that there’s no cost to the higher-level concepts the language introduces. Models, such as RPC, that make messaging look like method calls don’t call out the places where overhead is significantly higher than the code suggests, and I think that is bad.
There are other actor-oriented languages (I’m naming no names here) where the common perception is that messaging is so cheap that you can use agents instead of classes, but that perception is far separated from reality.
Thus, we made creating agents look different from creating classes, and we made message-passing explicit and “in your face” with operators that stand out in your code. We make you opt in to asynchronous methods, because you really don’t want to use them unless your code will block (do a receive).
I don’t actually believe that you should create agents all the time – it should be a conscious choice to deviate from object-oriented concepts. We want to strike a balance between shared-memory and message-passing in this language, and we are not trying to replace the object-oriented paradigm – within an agent, OO rules!
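For readers who haven’t used request/reply ports, the correlation they provide can be pictured with a toy model. The Python sketch below is not the Axum API. RequestReplyPort and its methods are hypothetical, and replies are deliberately produced in reverse order to show that each one still finds its request.

```python
import itertools


class RequestReplyPort:
    """Toy model: every request gets an id that its reply is filed under."""

    def __init__(self, handler):
        self._handler = handler
        self._ids = itertools.count(1)
        self._pending = {}         # correlation id -> request
        self._replies = {}         # correlation id -> reply

    def send(self, request):
        correlation_id = next(self._ids)
        self._pending[correlation_id] = request
        return correlation_id

    def process_all(self):
        # Handle requests in reverse order to simulate replies that
        # do not come back in the order the requests were sent.
        for correlation_id in reversed(list(self._pending)):
            request = self._pending.pop(correlation_id)
            self._replies[correlation_id] = self._handler(request)

    def receive(self, correlation_id):
        return self._replies.pop(correlation_id)


port = RequestReplyPort(lambda n: n * n)
a = port.send(3)
b = port.send(4)
port.process_all()
reply_b = port.receive(b)
reply_a = port.receive(a)
```

One agent instance per request gives you the same association, at the cost of creating an agent and a channel each time.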
Details of Coordination
Regarding sylvan’s last comment, I think I know what is meant. For example, why do I have to do all this just to respond to messages:
while (true)
{
    var x = receive ( PrimaryChannel::Port1 );
    doSomething ( x );
}
When all I really want is that all messages from Port1 go to ‘doSomething’?
We did this because one thing we wanted to make easier was writing very stateful agents, something that is typically quite challenging with the usual callback-based solutions: you wind up with a tangle of ad hoc state-machine goo. Our observation is that old-fashioned program counters and compiled structured control-flow is great for managing complex program state (I’m not talking of the kind of state you store in variables, but the state of the algorithm’s progress).
Thus, we have built rich support for control-flow based messaging, as described in the documentation that is now available via Dev Labs.
Let me then come back to sylvan’s issue – spending too much screen real estate on the mechanics of messaging. An early prototype of the language had only the control-flow-based messaging, and this was, as pointed out, problematic. Even if we don’t have to build completely stateless agents because we’re not distributing all parts of the application, it will still be the case that many agents will be mostly stateless and that even stateful agents can handle some of their messages using patterns typical of statelessness.
This was in fact how we came to introduce the data-flow concepts into the language: the desire to just forward messages to a method led to a generalization where you can build pipelines of methods that messages are passed through and possibly out again. The simplest network, which corresponds to hooking a callback to a port, is the forward operator:
PrimaryChannel::Port ==> doSomething;
We think the generalization of forwarding messages into the network concept is valuable because it allows for another form of parallelism through pipelining: each stage in a pipeline can run in parallel, subject to the same reader / writer rules that other agent and domain code is subject to, but more fine-grained (much less costly than creating a new channel for a new agent).
Networks also allow us to forward not just to methods, but to buffers of various kinds, such as queues and single-assignment variables.
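As a rough picture of what pipelining buys you, here is a Python sketch in which every stage runs on its own thread and the stages are connected by queues. It is only an analogy for an Axum network: stage, run_pipeline and the _STOP sentinel are invented for the example, and it ignores the reader/writer rules entirely.

```python
import queue
import threading

_STOP = object()                   # sentinel that shuts the pipeline down


def stage(fn, inbox, outbox):
    """Start a thread that applies fn to everything arriving on inbox."""
    def loop():
        while True:
            item = inbox.get()
            if item is _STOP:
                outbox.put(_STOP)  # pass the shutdown signal downstream
                return
            outbox.put(fn(item))

    thread = threading.Thread(target=loop)
    thread.start()
    return thread


def run_pipeline(values, *fns):
    queues = [queue.Queue() for _ in range(len(fns) + 1)]
    threads = [stage(fn, queues[i], queues[i + 1])
               for i, fn in enumerate(fns)]
    for value in values:
        queues[0].put(value)
    queues[0].put(_STOP)

    results = []
    while (item := queues[-1].get()) is not _STOP:
        results.append(item)
    for thread in threads:
        thread.join()
    return results


doubled_plus_one = run_pipeline(range(5), lambda n: n * 2, lambda n: n + 1)
```

Because each stage has a single worker and the queues are first-in, first-out, results leave the pipeline in the order the inputs entered it, while different stages can be busy with different items at the same time.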
Setting all this up in the agent constructor is really easy and something we find ourselves doing all the time. However, it is still very programmatic. We have considered a more declarative approach, something similar to VB’s ‘Handles’ syntax, which would be useful for the most common network: forwarding from a port to a method. It would be interesting to hear your thoughts on this.
Channels
Why do we require you to define channels? This also wastes screen real estate and is undoubtedly cumbersome. Why not just deal directly with agents? The reason is that tightly coupled component models invariably lead to brittle programs that cannot easily be distributed or partially re-implemented without breaking a lot of working code. By taking a hard line on loose coupling, we are hoping to establish that Axum is not about cutting corners: safe parallelism will require a level of formalism and rigor between components that hasn’t been common to date.
We believe that you either pay the price by doing more stuff upfront when designing your components and their interfaces, or later when you are trying to debug your already deployed application on a client-owned server.
That said, there could be much better ways of accomplishing this than what we have come up with, so don’t take the above as a dismissal of the concern. On the contrary, I share sylvan’s interest in making it much easier, I just don’t know how to do so (yet) without compromising what I consider some pretty critical aspects of the language model.
Thanks,
Niklas Gustafsson
It gives me great pleasure to announce the availability of our Axum technology preview at the MSDN DevLabs site. I have been working on this project off and on for several years, and for the last 18 months or so, I’ve had good company from Artur Laksberg and Josh Phillips. It has been a long and winding road to this point, but we are here now and that is incredibly exciting, I think.
Download it, take it for a spin, and let us know what you think. Be sure to check out Josh’s introductory video on Channel 9. If the weather where you are is anything like it is here in Seattle right now, there's really no reason to do anything besides building Axum apps this upcoming weekend! :-)
It needs to be reiterated that Axum is an incubation effort, which means that we’re not committed to shipping it in any particular product release or in the form offered by this preview. A lot will depend on your response and involvement with us.
This language is not a finished product – we are quite certain that it is too big and we have some ideas on what to take out, but would like to hear from you about it. We welcome suggestions on syntax, but we are more concerned about getting the semantics right first – only where syntax stands in the way of comprehension is it really a big deal for the moment.
Here are some (leading :-)) questions to ponder and help us understand:
1. Is the number of concepts you have to learn in order to use the language (I count four: agents, channels, send, receive) too many? Is this a complex language? There will always be a learning curve with any new technology, but looking past that…
2. Is overcoming that learning curve worth it if what you get is a solution offering safe parallelism?
3. Are data-flow networks as general as what Axum offers useful, or would a more constrained form be sufficient? If generality is indeed valuable, are we missing any constructs that would make networks even more useful?
4. What aspects of the language aren’t we explaining properly? What didn’t you get after trying it out?
5. What kinds of projects is Axum best suited for, what are its sweet-spots in your opinion? What kinds of projects is it less ideally useful for?
6. How feasible is the idea of a special-purpose language in the first place? Would you see yourself developing parts of your application in Axum, parts in C#, VB, or F#?
7. One of the innovations in Axum is the availability of asynchronous methods, which offer great scalability. Tell us how you are using them and what your experiences with them are. Let us know about your 10,000-agent application!
Please direct all comments on issues that you may run into to the Axum MSDN Forum, and make sure to actually read the README; it contains crucial information that will make things easier for you.
On behalf of the Axum team,
Niklas Gustafsson
Software Architect
Developer Division
The subject of immutability sparks intense interest among the people who follow our blog, as is evident from the comments to the recent post by Niklas. Naturally, inside our team, a lot of thinking goes into making parallelism safe, and the ideas about restricted types play a major role in that. The work in this area was spearheaded by Joe Duffy, and I highly recommend watching this Channel 9 video where he gives an overview of the motivation behind his thinking.
In this post, I want to give the readers some insight into why we’re so interested in immutability, and specifically how it applies to Axum as a language.
Recall our earlier discussion on isolation and reader/writer agents. An example from the post went like this:
domain D
{
    private int data1;
    private string data2;
    reader agent A : channel C
    {
        public A()
        {
            int n = parent.data1;
            string s = parent.data2;
        }
    }
}
The agent is declared a reader, and the code in the constructor reads the domain state. All is well.
Following the same logic, I want to read a field of type Employee, so I add a new domain field employeeOfTheMonth and put the following in the agent’s constructor:
Employee employee = parent.employeeOfTheMonth; // 1
If this doesn’t seem suspect yet, let me continue. Remember, we’re still in the reader agent:
employee.Salary += 10000; // 2
Whoa! We’ve just modified a domain field from a reader agent – doing exactly what reader agents are not supposed to do. Clearly, we want the Axum compiler to flag these kinds of problems, and for that, either line #1 or line #2 should produce a compile-time error. At the same time, the earlier example that reads the int and string fields must still compile without errors.
Fortunately, line #1 does produce an error. The assignment fails, because the type of the right-hand-side expression is not Employee but a read-only Employee. The idea of a read-only reference is similar to const in C++ in that the read-only data cannot be changed, and a read-only reference cannot be converted to a “regular” reference. For an agent that is a reader, the parent reference is typed as a read-only reference, and that is why the fields reachable through it cannot be mutated.
The syntax of the read-only reference is immaterial at the moment, and fortunately most of the time you don’t need to use it, because you can use type inference – the var keyword:
var employee = parent.employeeOfTheMonth; // OK!
This gets you past the error. However, now the second line has to fail, and it does:
employee.Salary += 10000; // Error: cannot modify a
// field via a read-only reference
Now what about int and string from the example above?
For ints, and types that behave like ints – types that exhibit value semantics – the read-only modifier simply doesn’t apply. For immutable types such as string, the read-only modifier would be superfluous: since the type itself is known to be immutable, there simply isn’t any other kind of expression of that type!
This is why a reader agent can, in fact, safely read an int and a string – because there is no risk that it will mutate any objects reachable from them. (Of course, a reader agent still cannot mutate the object itself – a statement like “parent.data1 = 10;” would trigger a compiler error)
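Axum enforces all of this at compile time, which a dynamic language cannot do. Still, a runtime analogy may help show the shape of the rule. In the Python sketch below, ReadOnly is a hypothetical wrapper that allows field reads, hands immutable values back unchanged, wraps everything else, and rejects writes. It covers fields only and is not meant to model method calls.

```python
class Employee:
    def __init__(self, salary):
        self.Salary = salary


class ReadOnly:
    """Hypothetical read-only view of an object (fields only)."""

    __slots__ = ("_target",)

    def __init__(self, target):
        # Bypass our own __setattr__, which rejects every write.
        object.__setattr__(self, "_target", target)

    def __getattr__(self, name):
        value = getattr(self._target, name)
        # Immutable values are safe to hand out as they are.
        if isinstance(value, (int, float, str, bool, bytes, type(None))):
            return value
        # Anything reachable through a read-only view is read-only too.
        return ReadOnly(value)

    def __setattr__(self, name, value):
        raise AttributeError(
            "cannot modify a field via a read-only reference")


def try_raise(employee):
    """Attempt the += from line #2 and report what happened."""
    try:
        employee.Salary += 10000
        return "modified"
    except AttributeError as error:
        return f"error: {error}"


employee_of_the_month = Employee(50000)
view = ReadOnly(employee_of_the_month)
through_view = try_raise(view)
```

Reading view.Salary works because the int comes back as a plain value. The write is refused, and the underlying object is untouched.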
While it is clear that the idea of separating read-only and read-write access to data is fundamental in Axum, this remains an area of active research and experimentation. Because Axum is a new – and so far experimental – language, we’re willing to entertain some far-reaching ideas. For example, some members of the team urge us to take a step towards the world of functional programming and make read-only the default.
If we were to take that step, the assignment at line #1 above would work as originally written, and you would have to use a special syntax to create a writeable reference. This opens up some opportunities for us, but also makes it harder for people with C# background to adopt Axum. This is one of the areas where feedback from readers would be incredibly valuable.
To give you an example of such an opportunity, consider the following snippet of code from a reader agent:
Employee employee = GetEmployeeFromSomewhere();
target <-- employee;
Now if the target were a part of the dataflow network in the same domain, we would want to send the data by reference (without deeply cloning it), and have that dataflow network start processing the data immediately. This would only be safe if we could guarantee that the agent that sent the data won’t be mutating that data while the dataflow network is reading it.
The employee being read-only would provide such a guarantee. Not having to say it in the code would make the guarantee, and the parallelism enabled by it, the default.
Artur Laksberg
We rarely speak of parallelism when discussing Axum, because you very rarely think about it when designing or writing an Axum application. Instead, you write a bunch of components which are all single-threaded and have them send messages to each other asynchronously; efficient use of parallel hardware comes from running many Axum agents simultaneously.
In most .NET-based scenarios, asynchronous operations are relatively scarce. When you need to use one of the various async features, it can have a huge impact on the structure and design of your application, but many manage to avoid having to do it at all.
Clearly, with Axum, this is not so. Interactions between agents, whether within a domain or in different domains, are always asynchronous. Thus, we have to go beyond what is available elsewhere in the platform and provide actual language support for asynchronous programming.
We are taking two approaches to handling asynchrony: a) data-flow networks, discussed in an earlier post by Artur Laksberg, and b) control-flow-based constructs. Here, I will cover the latter.
Receive
In Axum, asynchrony centers around ‘receive,’ which collects a message from a single source, such as a channel port. Logically, receive suspends execution of the agent or network that is executing until data is available from the source.
int x = receive ( PrimaryChannel::InputMessage );
A naive implementation of receive would block the actual thread that is in use when the operator is reached. We have this in place for situations when that is really what you want, but it places a ceiling on the scalability of the application. For example, using the standard stack size of 1MB for each thread, you will quickly run out of address space (especially on 32-bit systems) when creating lots of agents. Threads are expensive to create, delete, and hold on to, and relying on a thread pool doesn’t help one bit when you block the thread with a synchronous call.
A better approach is to perform compiler transformations on the code similar to what the C# compiler does for iterators, which move local variables to the heap and enable us to give up the thread while waiting for the data to come in. This makes agents much more light-weight and allows us to have thousands and thousands of agents running concurrently in a single process. I’ve had 500,000 blocked agents on my 2GB laptop without bringing it to its knees.
That kind of scale opens the model up to entirely new categories of algorithms, beyond what is practical when you can only have a couple hundred or a few thousand agents.
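You can get a feel for why heap-allocated state scales so much better than one thread per agent with a quick experiment in Python, whose generators keep their locals in a heap-allocated frame. This is an illustration by analogy, not a measurement of Axum. The agent, spawn_blocked and deliver functions are hypothetical, and the count of 50,000 is arbitrary.

```python
def agent(agent_id, mailbox, results):
    greeting = f"agent {agent_id}"   # local state that survives the wait
    while not mailbox:
        yield                        # "blocked" in receive, holding no thread
    results.append(f"{greeting} received {mailbox.pop()}")


def spawn_blocked(count):
    """Create count agents and run each one up to its first receive."""
    agents, results = [], []
    for agent_id in range(count):
        mailbox = []
        instance = agent(agent_id, mailbox, results)
        next(instance)               # runs until the agent blocks
        agents.append((instance, mailbox))
    return agents, results


def deliver(agents, index, message):
    instance, mailbox = agents[index]
    mailbox.append(message)
    next(instance, None)             # resume; None means it finished


agents, results = spawn_blocked(50_000)
deliver(agents, 49_999, "ping")
```

All 50,000 agents sit blocked on a single thread, and waking one of them is just a matter of resuming its saved frame.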
Alas, the compiler transformations are expensive: even when there are no receives in the code, the overhead is still there. Especially for leaf methods (those not calling other Axum methods), it is common to not do any message-passing, in which case the transformations are a waste of runtime performance.
Therefore, by default, methods are not transformed this way; we call these methods synchronous. The only category of methods that are asynchronous by default are the agent constructors, since they are never leaves. The default for other methods is overridden by declaring the method ‘asynchronous’:
asynchronous protected int foo() { ... receive (x); ...}
The compiler takes care of everything else – you do not have to do anything beyond adding that one keyword to the declaration. Asynchronous methods calling other asynchronous methods will do so via compiler transformations that give up the thread while waiting for results. In fact, receive is implemented in the runtime as an asynchronous method.
Asynchronous Programming Model
Those who have tried to do asynchronous programming in .NET will be familiar with the Asynchronous Programming Model, APM. It builds on an operation XXX being encoded in two methods, BeginXXX and EndXXX: the former starts the operation, the latter collects the results (including any exceptions). The pattern is very powerful and flexible, so much so that it is sometimes difficult to program against or implement.
Even though ‘receive’ is the real foundation of asynchrony in Axum, we chose to support the APM in the same way, by treating any APM operation as an asynchronous operation. We did this to make it easier to build applications that mix I/O operations and message-passing, a major scenario for realizing concurrency with the Axum model.
Thus, to call System.IO.Stream.Read() using the APM, you don’t have to do anything, as long as you’re invoking it from an asynchronous method. Your code will look completely synchronous, but the compiler will make sure that it is not holding onto the thread while waiting.
asynchronous int ReadFile(string path)
{
    Stream stream = …;
    int numRead = 0;
    int totalRead = 0;
    // This is where things are asynchronous.
    while ( (numRead = stream.Read(buffer, ...) ) > 0 )
    {
        PrimaryChannel::NextBufferRead <-- buffer;
        totalRead += numRead;
    }
    // Return the running total; numRead itself is zero once the loop exits.
    return totalRead;
}
Interleave
Once you have the tools you need to productively work with individual asynchronous operations, how do you realize the full potential of parallel hardware? One way, as stated earlier, is to run many agents and pass messages between them. That’s why we have the support for asynchronous messages in the first place. Another is to try to do I/O in parallel. The parallelism you wind up with using Axum is different from the structured, very regular patterns of parallel for and such constructs; ideally, you will find ways of using both together.
Let’s start by considering the scenario of having more than one outstanding I/O operation. Doing I/O concurrently is very efficient on Windows because it allows the operating system to balance the amount of work based on how many processors there are. Also, you can potentially do some preliminary work in your code while the hardware devices are operating independently.
Building on the earlier example, to perform two simultaneous file read operations, we can just write this piece of code:
asynchronous int ReadTwoFiles(string pthA, string pthB)
{
    int numRead = 0;
    interleave
    {
        numRead += ReadFile(pthA);
        numRead += ReadFile(pthB);
    }
    return numRead;
}
What happens here is that the two statements under ‘interleave’ are coordinated so that multiple reads may be outstanding at any one point in time, but it does not introduce parallelism in the code itself: what is running concurrently are the I/O operations, not the user code.
In this specific case, the code will start reading from file A, and as soon as a read operation doesn’t immediately complete, there is an opportunity to start reading from file B. When that pauses, which means that the device is working on our behalf, the read from file A may complete and the code can go on to reading the next chunk.
If it hasn’t yet completed, the code waits for either operation to finish first, and whichever comes in first is resumed. Unlike the code in ReadFile, where the read operations are asynchronous but strictly ordered, this is unordered asynchrony. However, the interleave block is, as a block, ordered with respect to operations outside it:
asynchronous int ReadThreeFiles(string pthA,
    string pthB,
    string pthC)
{
    int numRead = 0;
    interleave
    {
        numRead += ReadFile(pthA);
        numRead += ReadFile(pthB);
    }
    numRead += ReadFile(pthC);
    return numRead;
}
In the preceding example, either A or B may finish first, but reading from C isn’t started until both A and B have finished. Regardless, at most one thread of execution is active at any point in time, and we can have thousands and thousands of agents like this in a process, consuming a minimum of threads from the thread pool.
This keeps the system busy without introducing data races in our code: updating ‘numRead’ in each of the branches of the interleave is perfectly safe, because the two statements, while technically concurrent, will never be executing code in parallel: “concurrent waiting, serial execution.”
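For readers who want to experiment with “concurrent waiting, serial execution” outside of Axum, the following is a rough Python asyncio analogy, not the Axum implementation. The helper names (`read_file`, `branch`, `read_two_files`, `run_read_two_files`) and the fake chunk sizes are assumptions made up for the sketch; `asyncio.sleep(0)` stands in for an I/O wait.

```python
import asyncio

async def read_file(path, chunks, trace):
    """Hypothetical stand-in for ReadFile: each chunk is one awaited 'I/O'."""
    total = 0
    for size in chunks:
        await asyncio.sleep(0)      # simulated I/O wait; the thread is released
        trace.append(path)          # record which branch resumed
        total += size
    return total

async def read_two_files():
    trace = []
    totals = {"num_read": 0}

    async def branch(path, chunks):
        n = await read_file(path, chunks, trace)
        # There is no await between reading and writing the shared total,
        # so no other branch can run in the middle of this update.
        totals["num_read"] += n

    # Both branches wait concurrently, but run on one thread, one at a time.
    await asyncio.gather(branch("A", [10, 20]), branch("B", [1, 2, 3]))
    return totals["num_read"], trace

def run_read_two_files():
    return asyncio.run(read_two_files())
```

The shared total is updated without any lock, which is safe here for the same reason it is safe under interleave: the branches take turns on a single thread and only switch at the points where they wait.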
There can be any number of statements under the interleave, and they may be any kind of statement, including block statements (which are probably the most useful to have there). One limitation, though, is that the number of statements has to be known at compile time.
Sometimes, though, you really want the number of statements to be determined at runtime. We’ll look at that in the context of another scenario: getting agents to do work in parallel with each other.
As agent A, I can send a message to another agent B, then do a little bit of work before calling receive and waiting for a response. If A and B perform work at the same time, we have some concurrency in our application. At some point, though, A probably needs to hear back from B in order to proceed, and they will then stop working in parallel. Likewise, if B finishes its work before A hits the receive, the two will not be working in parallel. Sooner or later, only one thing is running.
Either way, unless B also contacts C, which contacts D, etc., the maximum increase in efficiency is 100% for that period of time. What can we do about that? As Gustafson’s Law (that’s John Gustafson, no relation) essentially points out: just do more! We can, for example, try to find a number of operations B, C, D, E and F, which are all independent of each other, and run them in parallel with A, all controlled from A.
For example:
asynchronous void a_method()
{
    foreach (var chan in {b, c, d, e, f})
    {
        chan::RequestPort <-- new Request(...);
        // Do something that takes a little while
        var result = receive(chan::ReplyPort);
        // Do some more processing with the result.
    }
}
The only thing is, that won’t work as we hoped – it is serial. A will work concurrently first with B, then C, then D, then E, then F. Again, the maximum increase in efficiency is 100%, just sustained for a longer period of time. This lengthening is a good thing, don’t get me wrong, as efficiency measured over time is increased. However, we don’t use that efficiency to improve our time-to-completion.
To do so, we need something else: a replacement for foreach. It looks like this, very similar to the interleave statement shown earlier:
asynchronous void a_method()
{
    interleave (var chan in {b, c, d, e, f})
    {
        chan::RequestPort <-- new Request(...);
        // Do something that takes a little while
        var result = receive(chan::ReplyPort);
        // Do some more processing with the result.
    }
}
What this will do is interact with B, C, D, E and F in an unordered fashion. That is, the five other agents will all be able to run in parallel, but we are not introducing any parallelism into A, as only one “fork” at a time will be executing the code in A. What happens is that the first iteration starts, sends its message, does some work, then blocks at the receive expression. At that point, the second iteration may start, do the same thing, and so on. This allows some overlap between the five agents we’re talking to.
Of course, if the work between the send and the receive is significant relative to the work going on in the other agents, there’s little overlap between them, since the other agent finishes before or right after A reaches the receive. This means that the second message isn’t sent until B has already finished, which takes us back to where we started.
The code can be fixed up to account for this by inserting a call to ‘wait(0)’:
asynchronous void a_method()
{
    interleave (var chan in {b, c, d, e, f})
    {
        chan::RequestPort <-- new Request(...);
        wait(0);
        // Do something that takes a little while
        var result = receive(chan::ReplyPort);
        // Do some more processing with the result.
    }
}
This will have the effect of yielding to the second branch, which yields to the third, etc. When all have started and blocked, the first restarts, etc. This increases the overlap between the other agents that we are orchestrating from A. wait(), like receive, is an asynchronous method. wait() should always be used instead of Thread.Sleep() in Axum. Thread.Sleep will block the thread invoking it, which is the last thing we want.
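The effect of the extra yield on ordering can be tried out with a small Python asyncio analogy. This is only a sketch under assumptions: `orchestrate`, `branch`, and `event_order` are hypothetical names, `asyncio.sleep(0)` plays the role of wait(0), and a short sleep stands in for the receive. It does not model the other agents, only the order in which A's branches send and work.

```python
import asyncio

async def orchestrate(yield_after_send):
    """Run five interleaved branches on one thread and record the event order."""
    events = []

    async def branch(name):
        events.append(f"send {name}")       # stands in for the send to RequestPort
        if yield_after_send:
            await asyncio.sleep(0)          # plays the role of wait(0)
        events.append(f"work {name}")       # local work between send and receive
        await asyncio.sleep(0.001)          # stands in for the receive on ReplyPort

    await asyncio.gather(*(branch(n) for n in "bcdef"))
    return events

def event_order(yield_after_send):
    return asyncio.run(orchestrate(yield_after_send))
```

Without the yield, each branch sends and then does its local work before the next branch gets to send; with it, all five sends go out before any of the local work starts, which is what gives the other agents the chance to overlap.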
The two forms of interleave may be used not only with receive, but also with the APM pattern and any asynchronous method (which will, eventually, use either receive or an APM operation). A more interesting example of using it to orchestrate I/O than what we saw earlier would be a web crawler, which spawns off a separate, asynchronous, line of execution for every web site it is crawling. Since operation latency is likely to be high, using an interleave to gather data in parallel rather than serially really makes sense.
Niklas Gustafsson
Axumite
Axum has a few storage modifiers not found in most other languages (although several languages incorporate futures and/or promises, which are similar concepts).
There are three of them, and they are useful mostly with concurrent algorithms, for synchronization between agents within a domain or between interleave branches within an agent. For inter-domain synchronization, channels remain the only method.
Here’s an example:
async int x;
const string str;
sync object obj;
These storage modifiers can be used for locals, parameters, domain and agent fields, etc. (but not as schema properties).
Such variables all have in common that they can be empty or full at any given point in time. Being empty is different from a nullable type having the value null: it literally means that the variable does not hold a value of any kind and therefore cannot be read. Readers will block when trying to get the value out of an empty location (of course, the compiler does the right thing to make it scalable).
As a category, we call this “empty full storage.” Readers of an empty storage location always block and writers can always write to an empty location. What is more interesting is what happens when the variable is full. We have the following possibilities:
Writers may…
- … block until the location is empty.
- … throw an exception.
- … overwrite the value already stored at the location.
Readers may…
- … get a copy of the data and leave the location full.
- … empty the location after reading it.
One can imagine having support for these operations on a single storage component, but we have chosen to define different components for three out of the six combinations that are possible.
They are:
- async, where writers overwrite and readers copy
- sync, where writers block and readers empty
- const, where writers throw and readers copy
Async is roughly the normal variable storage semantics, except that it can be empty if not initialized; sync is roughly a synchronous producer / consumer buffer, and const is a single-assignment variable. One neat thing is that ‘const’ is a generalization of a C# const variable; with the exception of the possibility to deadlock, it behaves just like a const in that each read operation that completes will return the same data.
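The three combinations can be mimicked in a few lines of Python, which may help when thinking about the semantics. This is a sketch under assumptions, not how Axum implements it: the `Cell` class and the `async_cell`, `sync_cell`, and `const_cell` helpers are hypothetical names, and blocking here really does block an OS thread, whereas Axum relies on compiler transformations to avoid that.

```python
import threading

class Cell:
    """Empty/full storage location; the policy decides what happens when full."""

    def __init__(self, on_full_write, read_empties):
        assert on_full_write in ("overwrite", "block", "throw")
        self._on_full_write = on_full_write
        self._read_empties = read_empties
        self._cond = threading.Condition()
        self._full = False
        self._value = None

    def write(self, value):
        with self._cond:
            if self._full:
                if self._on_full_write == "throw":
                    raise RuntimeError("location is already full")
                if self._on_full_write == "block":
                    self._cond.wait_for(lambda: not self._full)
            self._value, self._full = value, True
            self._cond.notify_all()

    def read(self):
        with self._cond:
            # Readers of an empty location always block.
            self._cond.wait_for(lambda: self._full)
            value = self._value
            if self._read_empties:
                self._value, self._full = None, False
                self._cond.notify_all()
            return value

def async_cell():
    return Cell("overwrite", read_empties=False)   # writers overwrite, readers copy

def sync_cell():
    return Cell("block", read_empties=True)        # writers block, readers empty

def const_cell():
    return Cell("throw", read_empties=False)       # writers throw, readers copy
```

The remaining three combinations from the lists above can be produced with the same constructor, which is one way to play with the question of whether the chosen three are the right set.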
If you return a const from a method, you are essentially giving the caller a handle to a future value. Const variables are not as flexible as full-fledged promises, though.
All three are useful for communicating between agents, while async and const are also useful when used in dataflow networks, as they are sources and targets. You can imagine a network which processes an incoming message and places the result in a const variable, which is then read by some control-flow-based code that blocks until the value is available.
When used in networks, both async and const act as broadcasters; the only difference is that a const variable will only ever broadcast a single message to its targets.
Three Questions
Empty / full storage is one of the more controversial aspects of Axum among the members of the internal team. There are those who think syntactic support in the language is useful and necessary, and there are those who would prefer to just see runtime classes implement these concepts. Which is right? I suspect it’s hard to answer that until you try it out for real, but it’s worth thinking about a bit.
Also, are the three above the right set? Artur Laksberg posted something on throttling earlier; his first example, where messages are dropped by the consumer could be reduced to a one-line modification of the consumer agent code if we had something where writers overwrite and readers empty. Are there other things we’re missing? throw/empty seems like a recipe for introducing races in code, and block/copy seems, well, silly…
Third, there are those on the team who would argue that ‘async’ is a bad moniker for its particular semantics (overwrite/copy). I would disagree, but there is another language feature that we all agree it would be better for, so we want to find another keyword for the storage modifier, and I’m therefore asking for your help with that. It has to be short (4-5 characters) and a good mnemonic for the semantics of the storage. Any and all suggestions will be considered…
Niklas Gustafsson
Axumite
Note: a variant of this text also appears in the Axum Programmer’s Guide, which will be distributed with the upcoming CTP.
Axum has great support for writing distributed applications. In fact, one of the reasons for taking such a hard line on isolation is so that domains can interact locally or remotely with no change in the model. By borrowing a page from how the Web is programmed and making it scale to the small, we can easily go back to its roots and interact across networks.
With domains being services of a SOA application, agents the protocol handlers, and schema the payload definitions (btw, schema are XML-schema compliant), we have an easy time mapping Axum to web services.
In the runtime that will be in the CTP, we have support for local, in-process channels as well as WCF-based channels.
To reach an agent within a domain, you have to give it an address; this is true in local and remote scenarios alike. Within a process, it’s a bit easier, because the agent type name itself acts as a “default” address if nothing else, but in the distributed scenario, we have to do a bit more. But it’s just a little bit.
The Axum runtime does this through an interface called IHost, which allows you to give the agent an address within a domain. To be precise, what we associate with an address is a factory for agents of the hosted type, which is used to create agent instances when someone creates a new connection. Each underlying communication / service hosting framework has to have its own implementation of IHost; Axum comes with one for WCF and one for in-process communication.
The address may be associated with an existing domain instance, in which case created agent instances are associated with that domain, or it may be associated with no domain instance, in which case created agent instances are associated with a new domain instance, one for each new connection.
For example, if you are building an Axum-based server, you can host domains as services with the following code:
channel Simple
{
    input string Msg1;
    output string Msg2;
}
domain ServiceDomain
{
    agent ServiceAgent : channel Simple
    {
        public ServiceAgent ()
        {
            // Do something useful.
        }
    }
}
agent Server : channel Microsoft.Axum.Application
{
    public Server ()
    {
        var hst = new WcfServiceHost(
            new NetTcpBinding(
                SecurityMode.None, false));
        hst.Host<ServiceDomain.ServiceAgent>(
            "net.tcp://localhost/Service1");
    }
}
Each time some client connects to the address "net.tcp://localhost/Service1", a new instance of ‘ServiceAgent’ will be created, associated with a brand new ServiceDomain instance. If, instead, we wanted the created agents to be associated with a single domain instance, we would have to pass one in to ‘Host’:
hst.Host<ServiceDomain.ServiceAgent>(
    "net.tcp:...",
    new ServiceDomain());
There is a corresponding interface for the client side, called ICommunicationProvider. This is used to create a new connection to an Axum service (or any service, for that matter; the client has no knowledge of whether it’s written in Axum, a consequence of loose coupling). It, too, must have a version for each underlying communication framework, and the Axum runtime comes with one for WCF and one for in-process communication.
Connecting to the service above would look like this:
var prov = new WcfCommunicationProvider(
    new NetTcpBinding(SecurityMode.None, false));
var chan = prov.Connect<Simple>("net.tcp://localhost/Service1");
Of course, you don’t have to create a new communication provider for each connection, or a new host for each Host call.
That’s pretty much it – you just make sure that you choose the right WCF binding, and it’s off to the races with WCF doing all the hard work for us. If you are using schema types to define your channel payloads, they are already DataContract-compliant and safe to use for both inter- and intra-process communication.
As it turns out, this is no different from how you program Axum within a process boundary: the only difference is which concrete IHost/ICommunicationProvider implementations you use, and what the addresses you create for your agents look like. In other words, the programming model for distributed and local concurrency in Axum applications is identical.
Anyway, that’s the short intro, but there’s not much else to it. The CTP won’t have any non-programmatic means of defining bindings, creating connections, or hosting services, but we would love to hear of your experiments with such things. Axum is a language, so we’re not really trying to solve things like that.
Niklas Gustafsson
Axumite
One of the problems in asynchronous message passing is preventing a situation where a sender produces messages faster than a receiver can handle them. Consider this piece of Axum code:
foreach(var price in GetNextStockPrice())
{
    report::Price <-- price;
}
If the stock quotes are produced faster than the agent behind the report channel can handle them, an overflow is bound to occur.
While this problem sounds like a fundamental flaw in asynchronous message passing due to its “fire and forget” nature, in real life things are not nearly so bad. In many cases you know from your application logic that the incoming rate of messages is such that, in all likelihood, they can comfortably be handled by the receiver. In a GUI application, when a message is triggered by the user pressing a button, and the receiver can handle at least a thousand such messages per second, it is reasonable to assume that the receiver will always be faster than the sender. (That is, until sometime later someone adds a call to some blocking API in the receiver, invalidating the above assumption.)
Having established that the incoming message rate is a potential problem, can we still get away without solving it? What if we could tolerate a loss of some messages and simply refuse to handle them if we’re too busy?
For example, if the code above were a part of a service that implements a stock dashboard, it might be fine to skip a few updates. If we didn’t get the latest quote now, we’ll get it in a few minutes with the new update. Not a perfect solution for a day trader, but could be OK for a Windows gadget.
One way to implement it would be to put a proxy agent between the sender and the receiver that does the actual work.
agent StockAgentProxy : channel StockTicker
{
    public StockAgentProxy()
    {
        while(true)
        {
            var price =
                receive(PrimaryChannel::Price);
            if( WorkerAgentIsBusy() )
            {
                // Too busy to handle the message, just drop it
            }
            else
            {
                // Forward the message to the actual worker:
                worker::Price <-- price;
            }
        }
    }
}
The proxy agent runs in parallel with the worker-agent, so that if the worker is busy processing data, the proxy can still consume messages coming from the sender and drop them if necessary.
Both the worker agent and the proxy implement the same channel, so the sender doesn’t even need to know it’s talking to a proxy.
If asynchrony is the problem, we can effectively turn asynchronous message passing into synchronous message passing by requiring a response to every request sent to the port. In Axum, we call such two-way ports request-reply ports.
While a regular one-way port is defined like this:
input Decimal Price;
the definition of the request-reply port would have the type of the reply follow the name of the port and a colon. Like this:
input Decimal Price : Signal;
For the purposes of this example, I decided to use a value-less type Signal, defined in the Axum runtime. A more refined solution could have some useful data in the payload of the reply – for example, the time of the acceptance of the message, or the congestion level of the server (based on which the sender could dial down the rate of messages) and so on.
Sending to a request-reply port yields a value that can be used to retrieve the acknowledgment of the request – here is how:
foreach(var price in GetNextStockPrice())
{
    var acknowledgment = report::Price <-- price;
    receive(acknowledgment);
}
Now the sender and receiver move at the same speed: the sender will not proceed before getting an acknowledgment that the message was received on the other end of the channel. Completely sacrificing asynchrony is a heavy-handed solution but we can use it as a base to build something more sophisticated.
Instead of requesting an immediate acknowledgment, we could keep on sending “optimistically” while keeping track of the pending requests, making sure we don’t have more than a certain number of them in flight. Here is how I coded it up:
var acknowledgments =
    new Queue<IInteractionSource<Signal>>();
foreach(var price in GetNextStockPrice())
{
    var acknowledgment = report::Price <-- price;
    acknowledgments.Enqueue(acknowledgment);
    if( acknowledgments.Count >= 10 )
    {
        receive(acknowledgments.Dequeue());
    }
}
Here we stash pending requests into a queue and then, having reached 10 pending requests, receive an acknowledgement from the request at the head of the queue. The protocol still requires a receive operation for each send, but the more pending operations we can have in flight, the more likely it is that the receive will complete without blocking.
What we’ve just implemented is a rather simplistic version of the Sliding Window Protocol. The protocol is used in various systems such as TCP or Microsoft SQL Server.
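If you want to play with the windowing idea without an Axum compiler at hand, here is a rough Python asyncio sketch of the same scheme. It is an analogy under assumptions: `send_with_window` and its nested `receiver` are hypothetical names, a task stands in for the acknowledgment returned by a request-reply port, and, unlike the Axum snippet above, the sketch also drains the acknowledgments still pending after the last send.

```python
import asyncio
from collections import deque

async def send_with_window(prices, window=10, handle_delay=0.0):
    """Send every price, keeping at most `window` unacknowledged requests in flight."""
    in_flight = 0
    max_in_flight = 0
    handled = []

    async def receiver(price):
        # Hypothetical receiver: finishing this task is the acknowledgment.
        nonlocal in_flight
        await asyncio.sleep(handle_delay)
        handled.append(price)
        in_flight -= 1

    acknowledgments = deque()
    for price in prices:
        in_flight += 1
        max_in_flight = max(max_in_flight, in_flight)
        acknowledgments.append(asyncio.create_task(receiver(price)))
        if len(acknowledgments) >= window:
            await acknowledgments.popleft()    # wait for the oldest acknowledgment
    while acknowledgments:                     # drain whatever is still pending
        await acknowledgments.popleft()
    return handled, max_in_flight
```

Running `asyncio.run(send_with_window(range(25), window=10))` delivers all 25 prices while the number of outstanding requests never exceeds the window size.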
Another idea is to do away with acknowledgments and instead rely on the receiver to tell us when to pause the transmission. We need to introduce two new ports, On and Off, for the receiver to communicate with the sender. Here is the code:
foreach(var price in GetNextStockPrice())
{
    Signal offSignal;
    if( tryreceive(report::Off, out offSignal) )
    {
        receive(report::On);
    }
    report::Price <-- price;
}
The tryreceive operator checks the availability of messages on the Off port. Unlike receive, tryreceive merely “peeks” into the port and doesn’t block waiting for a message. If there aren’t any messages on the Off port, the sender proceeds without slowing down. When a message does appear on the Off port, the sender consumes it and waits for a message from the port On to resume the submission.
The sender side is simple, but the receiver requires a bit more work. We again split it into two agents: the worker processing the data, and the proxy communicating with the sender and telling it when to stop or resume the transmission. First, the proxy:
public agent StockAgentProxy : channel StockTicker
{
    public StockAgentProxy()
    {
        var worker = Stock.StockAgent.CreateInNewDomain();
        bool senderIsActive = true;
        int itemsQueuedUp = 0;
        while(true)
        {
            var price = receive(PrimaryChannel::Price);
            do
            {
                worker::Price <-- price;
                // Have we reached the max capacity of
                // the receiver?
                if( ++itemsQueuedUp == 10 )
                {
                    if( senderIsActive )
                    {
                        // Tell the sender to stop sending
                        PrimaryChannel::Off <-- Signal.Value;
                        senderIsActive = false;
                    }
                    // Wait for the worker to complete
                    // pending work; the counter ends at zero
                    while( itemsQueuedUp > 0 )
                    {
                        receive(worker::On);
                        itemsQueuedUp--;
                    }
                }
            }
            while( tryreceive(PrimaryChannel::Price, out price) );
            if( !senderIsActive )
            {
                // Tell sender to resume sending
                PrimaryChannel::On <-- Signal.Value;
                senderIsActive = true;
            }
        }
    }
}
The proxy starts by waiting for a message from the sender, then forwards it to the worker. Having reached the maximum number of queued up items – 10 in my example – the proxy tells the sender to stop, then waits for the worker to complete all the pending requests. The process repeats for all the pending messages on the port Price, until it drains empty. After that the sender is told to resume the transmission.
The worker agent looks like this:
public agent StockAgent : channel StockTicker
{
    public StockAgent()
    {
        PrimaryChannel::Price ==> HandleUpdatedPrice;
    }
    private void HandleUpdatedPrice(Decimal price)
    {
        // Do some real work here ...
        PrimaryChannel::On <-- Signal.Value; // signal completion
    }
}
The above solution is known in networking as the XON/XOFF protocol. This protocol is in fact so common that the XON and XOFF commands have reserved positions in the ASCII table, as characters 17 and 19.
When would you use the Sliding Window and when the XON/XOFF protocol? It all depends on your situation. The Sliding Window requires a round-trip for each message, which is an obvious drawback. However, the sender will always stop after sending a certain number of unacknowledged messages.
The XON/XOFF protocol might not work if the receiver fails to respond in a timely manner: for example, the sender can potentially send thousands of messages before the receiver gets around to telling it to stop. Keep this in mind when using the protocol.
Other things to consider are whether you need to resend undelivered messages, how to ensure proper message order, or whether any of this would in fact be a problem in your situation. Lots of fascinating things to think about, but well beyond the scope of this post.
Artur Laksberg,
Axumite
When we talk about Axum as a programming language, we make the point that it is not an object-oriented language, but that it is still object-aware. What do we mean by this, and is it really true that you cannot define objects with Axum?
What we mean is that the core concept of Axum is not the “object” of object-oriented programming, but agents and domains. These could be viewed as objects, of course, but have so many constraints placed on them that anyone who is a fan of OO programming would protest against Axum as an OO language. It would also be obscuring the central point that we are trying to make. On the other hand, being a .NET language means floating on a sea of objects, so Axum must be aware of the underlying platform and its central paradigm, which is inescapably object-oriented.
Then, we usually say that “in fact, you can’t even define a class in Axum,” as if to prove the point that it’s not OO. This is true: there is no way to define classes. However, there is a way to define types, which we call “schema.” In C++ jargon, a schema type would be called a POD, something that is less than a full-fledged object. We’ve heard from some that “schema” isn’t a good name for this, so if you have a better name that works well as a language keyword, too, then we’ll be all ears.
A schema is a .NET class which contains only public properties and side-effect-free methods and a new kind of member called a ‘rule.’
Schema types are intended for use as payload definitions for channel communication, and thus are guaranteed to be deeply cloneable. The compiler generates the clone code, which is about 100x faster than reflection-based cloning. Schema instances are also guaranteed to be serializable and therefore automatically suitable for inter-process communication.
Why do we need this? If you are familiar with distributed programming, a schema is just a data-transfer object type, but with language support. The original reason for DTOs was to cut down on round-trips across the network – calling setters and getters on a remote object wasn’t really feasible. For Axum, the reason is somewhat different – it’s another constraint placed on objects – we simply cannot trust that types implementing ICloneable are doing so in a deep fashion (there’s no formal requirement to do so).
We could have built a deep-clone runtime capability based on reflection, but that would be orders of magnitude slower than the compiler-generated clone that having language support allows us to rely on.
A simple Address schema for US addresses:
schema Address
{
    required String StreetAddress;
    required String City;
    required String State;
    required String ZipCode;
    rules
    {
        require ! String.IsNullOrEmpty(StreetAddress);
        require ! String.IsNullOrEmpty(City);
        require ! String.IsNullOrEmpty(State);
        require ! String.IsNullOrEmpty(ZipCode);
        require State.Length == 2;
        require ZipCode.Length == 5 || ZipCode.Length == 10;
    }
}
Specifying rules for a schema is entirely optional, but can be a useful tool both because of the runtime enforcement it provides and because of the additional information it gives the reader of the source code. The rules are enforced when you send data to a channel port; they may only involve calls to methods that are known to be side-effect free.
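To give a feel for what "enforced when you send" means, here is a small Python sketch of the Address rules being checked at the point of sending. It is only an illustration under assumptions: `ADDRESS_RULES`, `violated_rules`, and `send` are hypothetical names, a plain dict stands in for a schema instance, and a list stands in for a channel port. It says nothing about how the Axum runtime actually reports a violation.

```python
# Each rule pairs a description with a side-effect-free check.
ADDRESS_RULES = [
    ("StreetAddress must not be empty", lambda a: bool(a.get("StreetAddress"))),
    ("City must not be empty", lambda a: bool(a.get("City"))),
    ("State must not be empty", lambda a: bool(a.get("State"))),
    ("ZipCode must not be empty", lambda a: bool(a.get("ZipCode"))),
    ("State must have 2 characters", lambda a: len(a.get("State") or "") == 2),
    ("ZipCode must have 5 or 10 characters",
     lambda a: len(a.get("ZipCode") or "") in (5, 10)),
]

def violated_rules(address):
    """Return the descriptions of all rules the address breaks."""
    return [text for text, holds in ADDRESS_RULES if not holds(address)]

def send(port, address):
    """Check the rules first, then hand a copy of the payload to the port."""
    broken = violated_rules(address)
    if broken:
        raise ValueError("; ".join(broken))
    port.append(dict(address))   # copy, so the sender cannot mutate it afterwards
```

Because the checks have no side effects, they can be run at any time and in any order without changing the outcome, which is the reason for the restriction on what rules may call.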
Schema are versionable, meaning that the version of the schema that you use to write a serialized object and the one you use to de-serialize don’t have to be exactly the same. When de-serializing a schema instance from a stream, only the required properties need to be found in the stream; the schema may also contain a number of optional properties, which, if not present, will be given default values.
If an optional field is present in the stream but not recognized by the target schema type, the data is stored in a private data structure so that the instance can be re-serialized without losing the information.
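The versioning behavior described above can be sketched in Python as follows. Everything here is an assumption made for illustration: a dict stands in for the serialized stream, the `Address` class with its `deserialize` and `serialize` methods is hypothetical, and the optional `Country` property and its default are invented for the example. The actual wire format and private data structure used by Axum are not shown in this post.

```python
class Address:
    REQUIRED = ("StreetAddress", "City", "State", "ZipCode")
    OPTIONAL = {"Country": "US"}      # hypothetical optional property and default

    def __init__(self, known, extras):
        self.known = known            # properties this schema version recognizes
        self._extras = extras         # unrecognized data, kept for round-tripping

    @classmethod
    def deserialize(cls, stream):
        missing = [name for name in cls.REQUIRED if name not in stream]
        if missing:
            raise ValueError(f"missing required properties: {missing}")
        known = {name: stream[name] for name in cls.REQUIRED}
        for name, default in cls.OPTIONAL.items():
            known[name] = stream.get(name, default)   # default when absent
        extras = {k: v for k, v in stream.items() if k not in known}
        return cls(known, extras)

    def serialize(self):
        # Unrecognized fields are written back out unchanged.
        return {**self.known, **self._extras}
```

An instance read from a stream written by a newer schema version, say one that adds an `Apartment` field, can be passed along and re-serialized by this older version without losing that field.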
Schema types are really simple – everything (except the type itself) is public, methods must be side-effect free, and the property definitions look like fields, i.e. you don’t get to define the implementation. Schema rules are invoked by the runtime.
We’ve discussed internally whether schema instances ought to be immutable, a property that would have all kinds of nice implications, but the code in the CTP that we hope to announce soon on this blog does not treat them as immutable. This is one area where getting feedback would be very valuable to us – should our transfer objects be immutable? In J2EE, for example, there is no strong recommendation one way or the other, but I’m thinking we should be a bit more specific.
I’m also thinking that we need to add compiler-generated equality and hash-key functions to make sure that schema have value rather than reference equality semantics. Clearly, schema types are by no means a finalized concept…
Niklas Gustafsson
Axumite