Available Now: Preview of Project “Orleans” – Cloud Services at Scale

This post was written by Niklas Gustafsson, Principal Program Manager on the Cloud Platform Tooling Team

Today, at Build 2014, we are announcing the preview release of a cloud programming model under the codename “Orleans”. This project originated in Microsoft Research. Project “Orleans” provides a straightforward approach to building distributed high-scale computing applications, without the need to learn and apply complex concurrency or other scaling patterns. It was designed for use in the cloud, and has been used extensively in Microsoft Azure.

Project “Orleans” has been used in production at Microsoft since 2011. Publicly, it is best known as the basis of the Halo 4 services architecture. If you’ve played Halo 4, you’ve relied on it as part of the game experience. The Halo 4 services team will tell you that they could not have achieved the scale and level of user interactivity they needed without Project “Orleans”.

Project “Orleans” enables scenarios that are difficult to productively implement using current technologies, complementing solutions like SignalR and WebAPI. Gaming, Internet of Things, social networks, and other scenarios with complex and dynamic relationships are particularly well suited for this model.

We think that now is the time to start getting feedback from a broader audience.  We’d appreciate you downloading the preview, building services on top of it, and telling us about your experience. We’ve had great experiences using it for products like Halo 4, but we know that we can make it better. We need your feedback to help us make the right updates.

Go get the bits at Microsoft Connect and check out our CodePlex site, which has samples and documentation to get you started with the Project “Orleans” preview!

Building large-scale apps is challenging

Large-scale cloud applications and services are inherently parallel and distributed. They are often interactive and dynamic, commonly requiring near-real-time direct communication between cloud entities. Such applications are very difficult to build today, on any platform. The development process demands expert-level programmers and typically requires expensive iterations of the design and the architecture as the workload grows.

No single solution will address all such challenges: they are too diverse, and the constraints placed on systems by legacy investments and other business-specific considerations are too complex. That said, some solutions stand out as particularly promising, among them the actor model.

The Actor Model

The actor model is a long-standing model for concurrent computation, invented by Carl Hewitt and dating back to 1973. Many languages and frameworks have been inspired by this model and built according to it, including several for .NET. It is an established pattern, so it has well-understood implementation approaches, semantics and capabilities.

The actor model is based on a collection of small objects that are referred to as “actors”. Everything – state and behavior – is modeled as an actor in this model. The actor is the sole primitive. The implementer defines a set of actors that makes sense for a given problem domain, and the allowable set of actions that they can have. This approach is very similar to object-oriented design principles, with a bias towards more finely defined objects. The execution mechanics are where the actor model starts to differ in more obvious ways.

Much as in object-oriented design, actors represent and mutate state. An entire service built with the actor model might conceptually own a database or a collection of tables within it, while an actor might represent a single row. State is typically updated based on interactions with other actors.

Actors communicate via asynchronous messages that are delivered via system-generated proxies. The actors themselves do not hold a direct reference to one another. This isolation and asynchronous communication simplify scale-out and elasticity of applications as well as recovery from failures.

Actors are always single-threaded, in terms of execution mechanics. That means that the state held by an actor is always consistent. There is no reason to consider data contention or locking schemes, since an actor is the sole owner of its data, and only ever runs on one thread. This characteristic simplifies writing highly concurrent systems by removing data races within and between actors.
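To make the single-threaded, message-driven execution concrete, here is a minimal single-process sketch in plain C#. It is not the Project “Orleans” API: `MailboxActor`, `CounterActor` and `Enqueue` are hypothetical names chosen for illustration, and a real actor runtime adds distribution, proxies and failure handling on top of this idea. The base class plays the role of the runtime by running queued messages one after another; the actor itself touches its state without any locking.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for the runtime: runs queued messages strictly one at a time.
public abstract class MailboxActor
{
    private readonly object _gate = new object(); // guards only the queue tail, never actor state
    private Task _tail = Task.FromResult(0);      // the most recently queued message

    protected Task<T> Enqueue<T>(Func<T> handler)
    {
        lock (_gate)
        {
            // A handler starts only after the previously queued handler has finished.
            Task<T> next = _tail.ContinueWith(_ => handler(), TaskScheduler.Default);
            _tail = next;
            return next;
        }
    }
}

// Application code: no locks, because handlers never overlap.
public sealed class CounterActor : MailboxActor
{
    private int _count; // state owned exclusively by this actor

    public Task<int> IncrementAsync()
    {
        return Enqueue(() => ++_count);
    }
}

public static class Program
{
    public static void Main()
    {
        var actor = new CounterActor();
        var sends = new Task<int>[1000];

        // Many threads send messages concurrently.
        Parallel.For(0, sends.Length, i => sends[i] = actor.IncrementAsync());
        Task.WaitAll(sends);

        Console.WriteLine(actor.IncrementAsync().Result); // prints 1001: no update was lost
    }
}
```

The only synchronization lives in the mailbox, which is infrastructure code; the increment itself is an ordinary unsynchronized `++`.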

Introducing Project “Orleans”

Project “Orleans” is a distributed implementation of the actor model. It is built on .NET and takes advantage of .NET features to expose an integrated .NET programming model for building your own services. Its design borrows heavily from Erlang and distributed object systems. It relies on static typing, message indirection and actor virtualization, which differentiate it from other actor model implementations.

Its main benefits are: 1) developer productivity, even for programmers who are not distributed systems experts; and 2) transparent scalability “by default,” that is, with no special effort from the programmer.

The primary reason for the preview release is to get your feedback on the developer productivity aspects of the library. Please share any and all feedback you have about the API, and if you have feedback on scalability, we’re interested in hearing from you on that topic, too. Tell us how easy it is to learn the API and build scalable services using it.

Familiar programming paradigm

Actors are .NET classes that implement .NET actor communication interfaces with asynchronous (Task- and Task<T>-based) methods and properties. Thus actors appear to the programmer as remote objects whose methods and properties can be invoked directly, albeit always asynchronously.

This provides the programmer with a familiar paradigm by turning messages into method calls, routed to the right endpoints, invoking the target actor’s methods and dealing with failures and corner cases in a completely transparent way.

A simple “hello world” interface may look like this:

public interface IHello : Orleans.IGrain
{
    Task<string> SayHelloAsync(string greeting);
}

Every actor system seems to pick its own concrete name for the actor concept. Erlang calls them ‘processes,’ and Orleans calls them ‘grains.’ Thus, the presence of ‘IGrain’ as the base interface above is what marks it as an interface to an actor.

An actor implementation that knows how to follow the conventions of the IHello interface might look something like this:

public class HelloGrain : Orleans.GrainBase, IHello
{
    Task<string> IHello.SayHelloAsync(string greeting)
    {
        return Task.FromResult("You said: '" + greeting + "', I say: Hello!");
    }
}

There are restrictions on what you can do in interfaces and classes intended to represent actors, but they are relatively few. For example, all methods and properties in the interfaces must return Task or Task<T>, and you should never create an actor instance directly using ‘new.’

Invocation of actors and transparent activation

The runtime activates an actor as-needed, only when there is a message for it to process. This cleanly separates the notion of creating a reference to an actor, which is visible to and controlled by application code, and physical activation of the actor in memory, which is transparent to the application.

This model is similar to virtual memory in that it decides when to “page out” (deactivate) or “page in” (activate) an actor; the application has uninterrupted access to the full “memory space” of logically created actors, whether or not they are in the physical memory at any particular point in time.

Transparent activation enables dynamic, adaptive load balancing via placement of actors across the pool of hardware resources. This feature is an improvement on the traditional actor model, in which actor lifetime and placement are application-managed.

If I’m writing a client of a system based on Project “Orleans” and need to establish a link to a particular actor instance, I would use the actor’s factory method and the actor’s global identity (each actor instance is identified by some unique key, for example a Guid). The factory methods are generated by the system as you compile actor interfaces.

What is returned by the factory method is a reference to a proxy for the actual actor. Only when a message is sent to the actor, that is, my code calls the ‘SayHelloAsync()’ method on the proxy, does the actor get activated in memory somewhere.

In this much-simplified HelloWorld case, where there is only one instance, ‘0’ seems like a good identity:

IHello friend = HelloFactory.GetGrain(0);
Console.WriteLine(await friend.SayHelloAsync("Good morning!"));
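The separation between obtaining a reference and activating the actor can be illustrated with a toy, single-process sketch. Nothing here is the Project “Orleans” API: `ToyRuntime`, `IGreeter`, `Greeter` and `GreeterProxy` are hypothetical names, and the real runtime also decides placement across servers. The point is only that a reference carries an identity, and that an instance comes into existence when the first message arrives.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public interface IGreeter
{
    Task<string> SayHelloAsync(string greeting);
}

public sealed class Greeter : IGreeter
{
    public Task<string> SayHelloAsync(string greeting)
    {
        return Task.FromResult("You said: '" + greeting + "', I say: Hello!");
    }
}

// Hypothetical stand-in for the runtime: tracks which actors are currently in memory.
public sealed class ToyRuntime
{
    private readonly ConcurrentDictionary<long, IGreeter> _active =
        new ConcurrentDictionary<long, IGreeter>();

    public int ActiveCount { get { return _active.Count; } }

    // Creating a reference is cheap and does not activate anything.
    public IGreeter GetReference(long id)
    {
        return new GreeterProxy(this, id);
    }

    // "Page out": drop the in-memory instance; existing references stay valid.
    public void Deactivate(long id)
    {
        IGreeter ignored;
        _active.TryRemove(id, out ignored);
    }

    // "Page in": create the instance on demand.
    private IGreeter Activate(long id)
    {
        return _active.GetOrAdd(id, _ => new Greeter());
    }

    // The proxy holds only the logical identity, never the actor instance.
    private sealed class GreeterProxy : IGreeter
    {
        private readonly ToyRuntime _runtime;
        private readonly long _id;

        public GreeterProxy(ToyRuntime runtime, long id)
        {
            _runtime = runtime;
            _id = id;
        }

        public Task<string> SayHelloAsync(string greeting)
        {
            return _runtime.Activate(_id).SayHelloAsync(greeting);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var runtime = new ToyRuntime();
        IGreeter friend = runtime.GetReference(0);
        Console.WriteLine(runtime.ActiveCount);                          // prints 0
        Console.WriteLine(friend.SayHelloAsync("Good morning!").Result);
        Console.WriteLine(runtime.ActiveCount);                          // prints 1

        runtime.Deactivate(0);
        friend.SayHelloAsync("Hello again").Wait();                      // same reference, re-activated
        Console.WriteLine(runtime.ActiveCount);                          // prints 1
    }
}
```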

Location transparency

An actor reference is a proxy object returned by the ‘GetGrain()’ factory method; the programmer uses the reference to invoke the actor’s methods or to pass it to other components. The reference only contains the logical identity of the actor. The translation of the actor’s logical identity to its physical location and the corresponding routing of messages are done transparently by the runtime. Application code communicates with actors independent of their physical location, which may change over time.

Single-threaded execution of actors

The runtime guarantees that an actor never executes on more than one thread at a time. Combined with the isolation of its state from other actors, this means the programmer never faces concurrency at the actor level, and hence never needs to use locks or other synchronization mechanisms to control access to data held by an actor. This freedom from data races is in many ways the core feature of the actor model, making it relatively straightforward to program concurrent systems.

As a somewhat contrived example, we can add a list of greetings and not worry about concurrent access to it, since the runtime arranges invocations to guarantee single-threaded access:

public class HelloGrain : Orleans.GrainBase, IHello
{
    List<string> greetings = new List<string>();

    Task<string> IHello.SayHelloAsync(string greeting)
    {
        if (greetings.Count < 100)
            greetings.Add(greeting);
        return Task.FromResult("You said: '" + greeting + "', I say: Hello!");
    }
}

In most circumstances, checking the length of the list and adding an element to it would have to be protected by some sort of mutual exclusion mechanism. With actors, mutual exclusion comes with the programming model.
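For contrast, here is roughly what the same check-then-add logic needs when an ordinary object is shared between threads. This is a plain .NET sketch, not part of Project “Orleans”; `SharedGreetingLog` and `TryAdd` are hypothetical names. Without the `lock`, two threads could both observe a count of 99 and both add, and `List<T>` itself is not safe for concurrent writes.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical non-actor version: callers on many threads share one object.
public sealed class SharedGreetingLog
{
    private readonly object _sync = new object();
    private readonly List<string> _greetings = new List<string>();

    public bool TryAdd(string greeting)
    {
        // The check and the add must happen as one atomic step.
        lock (_sync)
        {
            if (_greetings.Count >= 100)
                return false;
            _greetings.Add(greeting);
            return true;
        }
    }

    public int Count
    {
        get { lock (_sync) { return _greetings.Count; } }
    }
}

public static class Program
{
    public static void Main()
    {
        var log = new SharedGreetingLog();
        Parallel.For(0, 1000, i => log.TryAdd("hello " + i));
        Console.WriteLine(log.Count); // prints 100: the cap holds under contention
    }
}
```

In the grain version above, none of this locking appears in application code, because the runtime never runs two invocations of the same actor at once.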

Transparent integration with persistent store

Project “Orleans” allows for declarative mapping of actors’ in-memory state to a persistent store. It synchronizes updates, transparently guaranteeing that callers receive results only after the persistent state has been successfully updated.

For example, we may want to keep the list of greetings around even when a grain is deactivated due to disuse or system failures. The way to do that is to identify a persistent storage provider in your configuration and mark an actor as relying on the configured provider.

The runtime expects you to separate the persisted state in a type of its own, such as in this case:

[StorageProvider(ProviderName = "MyPersistentStore")]
public class HelloGrain : Orleans.GrainBase<IHelloState>, IHello
{
    async Task<string> IHello.SayHelloAsync(string greeting)
    {
        if (State.Greetings.Count < 100)
            State.Greetings.Add(greeting);
        await base.WriteStateAsync();
        return "You said: '" + greeting + "', I say: Hello!";
    }
}

public interface IHelloState : IGrainState
{
    List<string> Greetings { get; set; }
}

Note that the grain class has been modified so that it now extends GrainBase<IHelloState>, ‘IHelloState’ being the interface type used to define the persistent portion of the state. ‘State’ is a protected property found in the GrainBase<T> base class and refers to the persisted state.

We also had to add a call to WriteStateAsync(), a method defined in the GrainBase<T> base class, which is how you control when state is written out to the persistent store. Since WriteStateAsync is an asynchronous method, the SayHelloAsync method has also been made asynchronous with the ‘async’ keyword, so that it can use the ‘await’ operator to wait for the write operation to finish.
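The ordering guarantee (the caller sees a result only after the write has completed) follows directly from awaiting the write before returning. The sketch below shows that ordering with plain .NET types. It does not use the Project “Orleans” storage API: `IStateStore`, `InMemoryStore` and `PersistentGreeter` are hypothetical names, and the in-memory dictionary stands in for a real persistent store.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical storage abstraction, standing in for a configured storage provider.
public interface IStateStore
{
    Task WriteAsync(string key, List<string> greetings);
}

public sealed class InMemoryStore : IStateStore
{
    private readonly Dictionary<string, List<string>> _saved =
        new Dictionary<string, List<string>>();

    public async Task WriteAsync(string key, List<string> greetings)
    {
        await Task.Delay(10);                      // simulate storage latency
        _saved[key] = new List<string>(greetings); // keep a snapshot, not a live reference
    }

    public int SavedCount(string key)
    {
        List<string> list;
        return _saved.TryGetValue(key, out list) ? list.Count : 0;
    }
}

public sealed class PersistentGreeter
{
    private readonly IStateStore _store;
    private readonly string _key;
    private readonly List<string> _greetings = new List<string>();

    public PersistentGreeter(IStateStore store, string key)
    {
        _store = store;
        _key = key;
    }

    public async Task<string> SayHelloAsync(string greeting)
    {
        if (_greetings.Count < 100)
            _greetings.Add(greeting);

        // The reply is withheld until the store has acknowledged the write.
        await _store.WriteAsync(_key, _greetings);
        return "You said: '" + greeting + "', I say: Hello!";
    }
}

public static class Program
{
    public static void Main()
    {
        var store = new InMemoryStore();
        var greeter = new PersistentGreeter(store, "hello/0");

        Console.WriteLine(greeter.SayHelloAsync("Good morning!").Result);
        Console.WriteLine(store.SavedCount("hello/0")); // prints 1: already persisted when the reply arrived
    }
}
```

If the write fails, the awaited task throws and the caller's task faults instead of returning a greeting, so a caller never observes a success that was not persisted.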

Automatic code-generation at build time fills in the details, generating a class that implements the state interface; that is where the logic to map the state to something that can be stored is placed. Together with a configuration file entry shown below, this is all it takes to get the list backed up in an Azure Storage table:

<Provider Type="Orleans.Storage.AzureTableStorage"
          Name="MyPersistentStore"
          DataConnectionString="&lt;&lt;Your Azure Storage Connection String>>" />

Azure Storage is not the only place you can store data. While there aren’t a lot of options “in the box,” anyone may extend the storage options by adding custom storage providers. These are not hard to write, and there’s a sample storage provider solution on CodePlex demonstrating a file-system-based solution and one using MongoDB.

Where to Get It

The preview bits themselves are available via Microsoft Connect. You will be asked to fill out a survey as part of downloading the preview.

Download the Project "Orleans" Preview

We have also set up orleans.codeplex.com on CodePlex where samples and some documentation are hosted. We intend to keep adding samples as we go along with the preview. The CodePlex site’s forum can be used to discuss the preview and its uses.

Get Project “Orleans” Samples and Documentation

Please provide us with feedback on bugs via the Connect site, or send us your thoughts directly if you are more comfortable doing it that way. You can reach us via email at orlprevsupp@microsoft.com.

On behalf of the entire Project “Orleans” team at Microsoft Research, as well as the team in DevDiv working on it, I hope you enjoy taking part in the preview and find it a good use of your time!

  • Do you support (or plan to support) something like transactions or (explicit) synchronization? With every action being async there needs to be a way to organize interaction of multiple actors - or is this just me being ignorant of proper 'actor based design'?

    As long as actions just happen on one actor/grain I understand that you can always provide an explicit implementation of your "composite" action, which due to implicit synchronization (single-threaded execution) will be 'atomic'. But if you need to perform an action across different actors/grains and maintain some consistency, how would you do that?

  • The basic concept of the actor model is state isolation, which means that each actor, while potentially fine-grained, should be coarse enough to "own" all the data that needs to be manipulated atomically.  This may be easy or hard to do, depending on the data design and consistency requirements.

    Many, but not all, systems can be designed to not require full immediate consistency. Eventual consistency, for example, is good enough in many situations. On the other hand, in many situations, it is neither possible to contain all the data that need atomicity in a single actor, nor avoid the need for full transactions, and Orleans may currently not be a good solution in that situation.

    That said, support for cross-actor transactions is a topic under investigation.

  • Why not evolve the Project Orleans based on the TDF (TPL Data Flow) framework?

  • Daniel,

    You make a good observation: TPL Dataflow has many aspects that make it seem like the perfect foundation for an actor library. In fact, if what you need is a single-address-space actor library, you can probably build one using TPL Dataflow in a fairly modest amount of time.

    Project "Orleans" is an effort to try to extend the model to a distributed environment, which requires a lot more scaffolding and it's not clear that TPL Dataflow would be the right place to start. It might be, but that's not where the Orleans team in MSR began their journey.

    Fortunately, since both rely on Task / Task<T>, composition of TDF and Orleans code is fairly straightforward, and leveraging TDF within actors for processing repetitive data would be an interesting thing to try. I'm sensing an opportunity for a new sample... :-)

  • Is Orleans a tool like Twitter Storm that does real-time analytics on big data?

  • Hi, this is pretty amazing. I'm fairly familiar with Erlang, and none of the actor implementations in the .NET space I've seen so far had the distribution aspect in mind, so it's great to see that you guys have tackled this head-on in your design.

    I do have a couple of questions regarding error handling though:

    1) Is there support for explicit linking and supervision of actors?

    2) How do errors propagate through the network of actors?

    Erlang's 'let it crash' philosophy is in part thanks to its ability to monitor errors in a hierarchy of actors and recover gracefully via coordinated restarts based on configured strategies. Do you get similar capabilities in Orleans as well?

  • Mathi,

    No, it is not such a tool, but (depending on your requirements) you may be able to build one on the programming model offered by Project "Orleans."

  • Yan,

    No, Orleans follows the .NET asynchronous error propagation model, which is to propagate errors via the Task that each actor operation must return. This means that it composes well with other .NET technologies. The restart strategies are at present built into the runtime, but may potentially be opened up for configuration in the future.

  • When do you think the estimated RTM of Project "Orleans" would be?  I would love to use it on a project I am working on.

    Thank you,

    F.

  • Frank,

    We're not sure yet, but if you contact us at the email address above we can talk more about your plans and scenarios.
