Introducing Parallel Extensions to the .NET Framework

There is no escaping from concurrency challenges... or is there?

(A slightly modified version of this article was published in the August 2008 edition of the MSDN Flash newsletter)

Dual, quad, and eight-core processors are becoming the norm. Is your application capable of utilising all available processors? In order to achieve this level of utilisation on an n-processor machine, an application ideally needs at least n threads concurrently performing operations.

Although writing multi-threaded applications has become simpler over the years, many still find it challenging. In addition, many existing constructs, such as the .NET ThreadPool, have deficiencies that make them less suitable as the manycore shift begins to accelerate.

Over the past few years, Microsoft has been hard at work developing a new set of parallel extensions for .NET, aptly named Parallel Extensions to the .NET Framework. One of the additions is a set of lightweight and scalable thread-safe data structures and synchronization primitives such as a concurrent dictionary and a spin lock.

It also includes an implementation of LINQ-to-Objects that automatically executes queries in parallel, scaling to utilise most or all of the available processors without developers explicitly managing the distribution of work across all processors. This technology is called Parallel LINQ or PLINQ and the query syntax is almost identical to that of LINQ-to-Objects.

var q = from c in customers.AsParallel()

        join r in regions on c.RegionID equals r.RegionID

        where c.City == "London" && r.Name == "Kingston"

        select c;
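For readers who want to run a query of this shape, here is a minimal, self-contained sketch. It assumes the PLINQ API as it later shipped in .NET 4 rather than the CTP build; in the released version both sides of a join must be parallel queries, hence `regions.AsParallel()`. The `Customer` and `Region` classes, the sample data and the `KingstonLondoners` helper are made up for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical entity types, just enough to make the query compile.
class Region { public int RegionID; public string Name; }
class Customer { public string Name; public string City; public int RegionID; }

static class PlinqDemo
{
    // Returns the names of London customers in the Kingston region, sorted by name.
    public static string[] KingstonLondoners(Customer[] customers, Region[] regions)
    {
        var q = from c in customers.AsParallel()
                join r in regions.AsParallel() on c.RegionID equals r.RegionID
                where c.City == "London" && r.Name == "Kingston"
                orderby c.Name
                select c.Name;
        return q.ToArray();
    }

    static void Main()
    {
        var regions = new[]
        {
            new Region { RegionID = 1, Name = "Kingston" },
            new Region { RegionID = 2, Name = "Camden" }
        };
        var customers = new[]
        {
            new Customer { Name = "Dave",  City = "London", RegionID = 1 },
            new Customer { Name = "Alice", City = "London", RegionID = 1 },
            new Customer { Name = "Bob",   City = "London", RegionID = 2 },
            new Customer { Name = "Carol", City = "Leeds",  RegionID = 1 }
        };

        Console.WriteLine(string.Join(", ", KingstonLondoners(customers, regions)));
        // prints: Alice, Dave
    }
}
```

The orderby clause is only there to make the output deterministic; without it PLINQ is free to return the matches in any order.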

When a repetitive computation can be parallelised, the Parallel Extensions to the .NET Framework offers a few interesting data-oriented operations such as For and ForEach that execute a loop in which iterations may run in parallel on parallel hardware. For example the following classic foreach loop can easily be converted to a parallel foreach:

Sequential execution:

foreach (var f in formulae)

  Calculate(f);

Possible parallel execution:

Parallel.ForEach(formulae, f => Calculate(f));

 

The actual parallelisation of actions is managed by the Task Parallel Library (TPL) that provides an abstraction layer on top of raw threads. TPL uses the notion of Task that is the smallest unit of work and could potentially be executed by any thread owned by TPL. Therefore correctly identifying tasks that could be executed in parallel is crucial. It is then the job of the scheduler component of TPL to decide whether those tasks should execute sequentially or in parallel. This decision is usually made based on available system resources and possible user preferences.

// Create a task to call Calculate(object o) and schedule it for execution

Task t = Task.Create(Calculate);

 

// Other activities...

 

// Wait for this task to complete

t.Wait();
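Task.Create is specific to the CTP and did not survive into the released framework. As a rough equivalent, the sketch below uses Task.Factory.StartNew from the .NET 4 version of the library; the `Calculate` method here is a hypothetical stand-in that simply doubles its input.

```csharp
using System;
using System.Threading.Tasks;

static class TaskDemo
{
    // Hypothetical stand-in for the Calculate method mentioned in the text.
    public static int Calculate(int input)
    {
        return input * 2;
    }

    static void Main()
    {
        // Create a task and schedule it for execution.
        Task<int> t = Task.Factory.StartNew(() => Calculate(21));

        // Other activities...

        // Wait for the task to complete, then read its result.
        t.Wait();
        Console.WriteLine(t.Result); // prints 42
    }
}
```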

Developing concurrent applications requires a shift in mindset. For example, only a single exception can be thrown at any one time when sequentially executing a for loop. However executing the same loop in parallel may warrant the need for dealing with multiple concurrent exceptions. There are also a few parallelism blockers that should be avoided if you are planning to benefit from Parallel Extensions to the .NET Framework when it is released.
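To make the point about multiple exceptions concrete, here is a small sketch, again assuming the API as released in .NET 4, where a parallel loop reports failures through a single AggregateException. The `Calculate` method and the rule that even inputs fail are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

static class ParallelErrors
{
    // Hypothetical calculation that fails for even inputs.
    static void Calculate(int formula)
    {
        if (formula % 2 == 0)
            throw new InvalidOperationException("Cannot calculate formula " + formula);
    }

    // Runs all calculations in parallel and collects the failure messages.
    public static List<string> RunAll(int[] formulae)
    {
        var messages = new List<string>();
        try
        {
            Parallel.ForEach(formulae, f => Calculate(f));
        }
        catch (AggregateException ae)
        {
            // One entry per iteration that threw before the loop stopped.
            foreach (Exception inner in ae.InnerExceptions)
                messages.Add(inner.Message);
        }
        return messages;
    }

    static void Main()
    {
        foreach (string message in RunAll(new[] { 1, 2, 3, 4 }))
            Console.WriteLine(message);
    }
}
```

How many inner exceptions you see depends on timing: the loop stops scheduling new iterations once one has failed, so the handler should be written for one or many.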

There is a lot of promise in Parallel Extensions. If you want to find out more, I’d encourage you to download the June 2008 Community Technology Preview build.
ADO.NET Data Services: CLR-based data models and navigation links

As you are probably aware, one of the new additions to the .NET Framework 3.5 SP1 is a technology called ADO.NET Data Services (code name Astoria). ADO.NET Data Services natively supports ADO.NET Entity Framework (EF) models. However, Data Services is not limited to EF.

ADO.NET Data Services can deploy consistent representations of data regardless of the underlying data source. This approach has made CLR-based data models first class citizens. See here for a comprehensive guide on using CLR-based data models.

Imagine the following class diagram that represents two entities:

 

As you can see, Library has a collection of Books. Therefore Library.Books can be considered a one-to-many resource navigation link to a collection of Books.

public class Library

{

  public int LibraryID { get; set; }

 

  public string LibraryName { get; set; }

 

  public List<Book> Books { get; set; }

}

 

public class Book

{

  public int BookID { get; set; }

 

  public string BookTitle { get; set; }

}

A simple Data Services’ resource container exposing a list of libraries may look similar to the following:

public class LibraryDataModel

{

  static List<Library> s_libraries = ...;

 

  public IQueryable<Library> Libraries

  {

    get

    {

      return s_libraries.AsQueryable();

    }

  }

}

Unfortunately, exploring this data model (for example by browsing to the URI that exposes it) may fail with an exception similar to the following:

Request Error: The server encountered an error processing the request. The exception message is 'The property 'Books' on type 'AstoriaWeb.Library' is not a valid property. Properties whose types are collection of primitives or complex types are not supported.'

This means that ADO.NET Data Services was not able to automatically consider Books as a resource set that can be referenced using resource navigation links from Library resources. In fact, without some more information, it is really hard for Data Services to automatically expose a new resource set through a resource container.

So how can you avoid this? For this release of Data Services at least, you need to explicitly expose Books as a resource set in the resource container:

public class LibraryDataModel

{

  static List<Library> s_libraries = ...;

  static List<Book> s_books = ...;

 

  public IQueryable<Library> Libraries

  {

    get

    {

      return s_libraries.AsQueryable();

    }

  }

 

  public IQueryable<Book> Books

  {

    get

    {

      return s_books.AsQueryable();

    }

  }

}

This article was written based on .NET Framework 3.5 SP1 Beta.

.NET debugging made easier

Not sure about you but I was not aware of the existence of the DebuggerStepThroughAttribute. Debugging code can be difficult at times and any tool or mechanism that can ease this pain is always welcome.

As far as the CLR is concerned, there are no semantics attached to this attribute. However, Visual Studio does not step into methods or classes that are decorated with it. Although you can still set breakpoints in those methods or classes, they are never hit.

And here is how breakpoints would look in Visual Studio 2008:

 

This attribute can be applied to methods, property accessors, classes and structs. I found it to be extremely useful when I wanted to step through a method without first stepping through all property accessors used as parameters or the source instance of the method. In the example below, a.Value and b.Value are not stepped through when pressing F11 in Visual Studio. It is only the Add method that is fully debugged:

var a = new SomeValue(10);

var b = new SomeValue(150);

...

Add(a.Value, b.Value);


class SomeValue

{

  ...

 

  public int Value

  {

    [DebuggerStepThrough]

    get { return value; }

  }

}
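Filling in the elided parts, a complete version of this example might look like the sketch below. The `_value` field, the constructor and the body of `Add` are assumptions, since the original only hints at them.

```csharp
using System;
using System.Diagnostics;

class SomeValue
{
    private readonly int _value;

    public SomeValue(int value)
    {
        _value = value;
    }

    public int Value
    {
        // The debugger steps over this accessor instead of into it.
        [DebuggerStepThrough]
        get { return _value; }
    }
}

static class StepThroughDemo
{
    public static int Add(int x, int y)
    {
        return x + y;
    }

    static void Main()
    {
        var a = new SomeValue(10);
        var b = new SomeValue(150);

        // Pressing F11 here goes straight into Add, skipping both getters.
        Console.WriteLine(Add(a.Value, b.Value)); // prints 160
    }
}
```

The attribute only influences the debugger; the program behaves identically with or without it.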

Enjoy!

WebChannelFactory inside a WCF Service

When using a WebChannelFactory inside a WCF service that already has an OperationContext, you may need to create a new context before you can successfully call out using a channel created by the WebChannelFactory. (Notice the OperationContextScope line in the example below.)

public class RelationService : IRelationService

{

  public Relation[] GetRelations()

  {

    var factory = new WebChannelFactory<ICustomerService>(

      new Uri("http://localhost/customerservice/customers.svc"));

    var proxy = factory.CreateChannel();

    using ((IDisposable)proxy)

    using (new OperationContextScope((IContextChannel)proxy))

    {

      var customers = proxy.GetCustomers();

      return customers.Select(c => new Relation(c.Name, c.Age)).ToArray();

    }

  }

}

In the above example, GetRelations() is a RESTful service operation calling into another RESTful service located at the Uri shown above.

Without using the new context, you may get an exception similar to the following:

ProtocolException: The remote server returned an unexpected response: (405) Method Not Allowed.

When investigating further, you may notice that WCF could be using an incorrect HTTP verb for communicating with the service that exposes the GetCustomers() operation.

As it happens, here is what the ICustomerService service contract looks like:

[ServiceContract]

public interface ICustomerService

{

  [OperationContract]

  [WebGet(

      UriTemplate = "/customers",

      ResponseFormat = WebMessageFormat.Xml,

      BodyStyle = WebMessageBodyStyle.Bare

  )]

  List<Customer> GetCustomers();

}

As you can see, it expects the GET verb. Without a new context, WCF ends up using the POST verb, which eventually causes the above exception.

Coordination Data Structures – WriteOnce<T>

This is an article in a series of blog entries describing a set of new Coordination Data Structures (CDS) introduced in the June 2008 CTP of the Parallel Extensions for .NET Framework.

In C#, when a field declaration includes a readonly modifier, assignments to the fields introduced by the declaration can only occur as part of the declaration or in a constructor in the same class:

readonly Company _company = new Company();

The assignment window is limited to the constructor or the declaration. In many applications however, it is desirable to have the readonly behaviour (set once and read many times) without this limitation. As you might have guessed, the Parallel Extensions now provides a type (System.Threading.WriteOnce<T>) for this exact purpose.

WriteOnce<T> exposes a Value property that can only be set once. This is obviously done in a thread-safe manner:

// this is a struct, so it will not require an explicit construction

WriteOnce<Guid> wo;

wo.Value = Guid.NewGuid();

 

As you would expect, attempting to reassign the Value property will throw an exception (“The variable is already initialized”).

 

WriteOnce<T> is a thread-safe type, so if several threads try to set the Value property concurrently, only one succeeds and the others throw the above exception. To avoid the exception, use the TrySetValue method instead:

WriteOnce<Guid> wo;

if (wo.TrySetValue(Guid.Empty))

{

  // the value was set successfully

}

Attempting to access the Value of a WriteOnce<T> before it has been set invalidates the WriteOnce<T> instance for all future access. The instance will be corrupt and can no longer be used for future gets and sets.

The instance is corrupted because a read before the write indicates a race condition. Proper synchronization should be used to ensure that this race cannot happen, so always ensure that the value is set before accessing it. When you are not certain whether the Value is set, you can access it through the TryGetValue method. This returns default(T) if the value is not set and does not corrupt the state of the instance for future use:

Guid val;

if (wo.TryGetValue(out val))

{

  // the value has been set previously

}
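To illustrate the set-once semantics, here is a minimal sketch of a write-once holder built on Interlocked.CompareExchange. It is not the CTP implementation: `SetOnce<T>` and its nested `Box` are hypothetical names, it is a class rather than a struct, and it does not reproduce the "corrupt on early read" behaviour described above (an early read simply throws).

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for WriteOnce<T>, written as a class for simplicity.
sealed class SetOnce<T>
{
    // Boxing the value lets one atomic reference swap publish it.
    private sealed class Box
    {
        public readonly T Item;
        public Box(T item) { Item = item; }
    }

    private Box _box; // null until the value has been set

    // Returns true only for the single caller that wins the race.
    public bool TrySetValue(T value)
    {
        return Interlocked.CompareExchange(ref _box, new Box(value), null) == null;
    }

    // Returns false and default(T) when no value has been set yet.
    public bool TryGetValue(out T value)
    {
        Box box = Volatile.Read(ref _box);
        value = box != null ? box.Item : default(T);
        return box != null;
    }

    public T Value
    {
        get
        {
            T value;
            if (!TryGetValue(out value))
                throw new InvalidOperationException("The variable is not initialized");
            return value;
        }
        set
        {
            if (!TrySetValue(value))
                throw new InvalidOperationException("The variable is already initialized");
        }
    }
}

static class SetOnceDemo
{
    static void Main()
    {
        var once = new SetOnce<Guid>();
        Console.WriteLine(once.TrySetValue(Guid.NewGuid())); // prints True
        Console.WriteLine(once.TrySetValue(Guid.Empty));     // prints False
    }
}
```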

Please note that WriteOnce<T> is a value type (struct), and as such, you need to be careful about access patterns. If you accidentally make a copy of the struct, you’ll be copying by value meaning that you will be using a replica rather than the original. For example in the code below, both assignments to what might appear as the same WriteOnce instance are allowed. The reality is however that wo1 and wo2 are two different instances:

WriteOnce<Guid> wo1;

 

Task.Create(

  s =>

  {

    WriteOnce<Guid> wo2 = (WriteOnce<Guid>)s;

 

    // assignment (1)

    wo2.Value = Guid.NewGuid();

 

    Console.WriteLine(wo2.Value);

  }

  , wo1);

 

// assignment (2)

wo1.Value = Guid.NewGuid();

In the example below however, it will not be possible to assign to wo2.Value (assignment 2 will fail). Although wo2 is a replica copy of wo1, internally it is referencing the same value object assigned in (1) which is not cloned:

WriteOnce<Guid> wo1;

 

// assignment (1)

wo1.Value = Guid.NewGuid();

 

Task.Create(

  s =>

  {

    WriteOnce<Guid> wo2 = (WriteOnce<Guid>)s;

 

    // assignment (2)

    wo2.Value = Guid.NewGuid();

  }

  , wo1);

(Thanks to Joe Duffy, Ed Essey and Stephen Toub for their inputs and support)

Coordination Data Structures – SpinLock

This is an article in a series of blog entries describing a set of new Coordination Data Structures (CDS) introduced in the June 2008 CTP of the Parallel Extensions for .NET Framework.

Waiting on locks usually results in a thread context switch and an associated kernel transition, which at times can be considered costly. On a multi-processor machine, it is often more efficient to busy wait for a short period of time instead of paying the cost of an expensive context switch and a possible transition to kernel mode.

One approach to busy waiting is to wait in a loop repeatedly checking until the lock becomes available. These types of locks are called spin-locks.

System.Threading.SpinLock provides a mutual exclusion lock primitive where a thread trying to acquire the lock waits in a loop repeatedly checking until the lock becomes available.

When acquiring a spin-lock, it uses the CompareExchange interlocked operation to ensure that the owner of the lock is assigned in a thread-safe manner (this implementation detail is subject to change):

if(Interlocked.CompareExchange(ref _owner, newOwner, owner) == owner && ...)

Threads that fail to become the owner of the lock wait using a SpinWait until the lock is released (pseudo code):

public void Enter()

{
 
  SpinWait sw = new SpinWait();
  ...

  while (true)

  {

    if (!IsOwnerSet())

    {

      // set the owner - atomic

      if (SetOwner())

      {

        // this thread is the owner so return

        return;

      }

    }

    // spin wait once

    sw.SpinOnce();

  }

  ...

}
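The pseudo code above can be turned into a small runnable sketch. This is not the SpinLock implementation itself; `SimpleSpinLock` is a hypothetical, non-reentrant toy that uses only Interlocked.CompareExchange and the SpinWait struct as they exist in the released framework.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Toy spin-lock for illustration; prefer the framework's SpinLock in real code.
sealed class SimpleSpinLock
{
    private int _taken; // 0 = free, 1 = held

    public void Enter()
    {
        var sw = new SpinWait();
        // Atomically flip 0 -> 1; keep spinning while another thread holds the lock.
        while (Interlocked.CompareExchange(ref _taken, 1, 0) != 0)
        {
            sw.SpinOnce();
        }
    }

    public void Exit()
    {
        // Release the lock so a spinning thread can acquire it.
        Volatile.Write(ref _taken, 0);
    }
}

static class SimpleSpinLockDemo
{
    public static int CountTo(int iterations)
    {
        var gate = new SimpleSpinLock();
        int counter = 0;
        Parallel.For(0, iterations, i =>
        {
            gate.Enter();
            try { counter++; }
            finally { gate.Exit(); }
        });
        return counter;
    }

    static void Main()
    {
        Console.WriteLine(CountTo(100000)); // prints 100000
    }
}
```

SpinWait.SpinOnce does more than burn cycles: after a number of spins, and always on a single-processor machine, it yields the thread instead.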

One point to mention here is the effect of a spin-lock on a single-processor machine. On these systems, the spin-lock hampers performance by adding an unnecessary delay to the whole process. It will not result in a deadlock, because after executing its quantum the spinning thread is context-switched or pre-empted by another thread. However, the time spent looping is not productive.

Due to this behaviour, the System.Threading.SpinLock primitive ensures we always yield (switch threads) on single-processor machines instead of using busy waits. 

Unlike Monitor, SpinLock publicly offers a set of reliable enter methods for acquiring a lock (i.e. ReliableEnter and TryReliableEnter). You should probably never use the unreliable enter methods because in the case of any asynchronous exceptions such as ThreadAbortException and OutOfMemoryException, you may end up with an orphaned lock that is never released. This is particularly true in ASP.NET since it uses aborts very aggressively.

As a side-note, you may wonder how we achieve this reliability. As it happens, the default CLR host postpones async exceptions inside a finally block until the end of the block. Therefore any part of the code that could result in an orphaned lock can be included inside a finally block:

try

{

}

finally

{

  // take the lock in here

  // ...

}
Please note that another host could choose to escalate thread aborts to rude thread aborts, which can interrupt finally blocks.
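For comparison, the SpinLock type that eventually shipped in .NET 4 dropped the ReliableEnter naming and exposes the reliable pattern through Enter(ref bool lockTaken). The sketch below assumes that released API, not the CTP one; the `Counter` class and its members are made up for the example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class Counter
{
    // Deliberately not readonly: a readonly struct field would be copied on each call.
    static SpinLock s_lock = new SpinLock();
    static int s_count;

    public static int Count
    {
        get { return s_count; }
    }

    public static void Increment()
    {
        bool lockTaken = false;
        try
        {
            // lockTaken tells the finally block whether the lock was really acquired.
            s_lock.Enter(ref lockTaken);
            s_count++;
        }
        finally
        {
            // Only release what was actually acquired.
            if (lockTaken) s_lock.Exit();
        }
    }

    static void Main()
    {
        Parallel.For(0, 10000, i => Increment());
        Console.WriteLine(Count); // prints 10000
    }
}
```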

 

Closer investigation of the SpinLock.Exit method reveals an overload that takes a Boolean. If this Boolean is set to true, SpinLock flushes the write buffers associated with the lock in order to ensure that all processors are immediately made aware that the lock is now available. This is more expensive but will prevent a situation where one processor is given an unfair advantage to reacquire the lock:

SpinLock sharedLock = new SpinLock();

 

while (true)

{

  sharedLock.Enter();

  sharedLock.Exit();

  // very little between Exit and retry Enter;

  // possibly no other processor would see the lock available!

}

 

There are a few more things to remember when using the SpinLock:

1-      When using the default constructor, the thread that owns the lock can enter the same lock and effectively cause a deadlock:

SpinLock l = new SpinLock();
l.Enter(); // non-blocking call

l.Enter(); // blocking call

 

2-      When using the default constructor, any thread can release the lock (This is not a recommended practice and should be avoided in normal circumstances):

SpinLock l = new SpinLock();

l.Enter();

ThreadPool.QueueUserWorkItem(delegate

{

  l.Exit();

});

3-      In the current implementation, you can change the above behaviour by setting the isThreadOwnerTrackingEnabled parameter to true when creating a new instance of SpinLock.
 

Setting isThreadOwnerTrackingEnabled to true changes the behaviour of the lock by:

-          Prohibiting the owner thread from trying to enter the same lock – throws a LockRecursionException; and,

-          Ensuring the Exit method is only called by the owner thread – a SynchronizationLockException is thrown otherwise

 

Something else that’s worth mentioning is that SpinLock is a struct, and as such, you need to be careful about access patterns. If you accidentally make a copy of the struct, you’ll be copying by value, meaning that you will be using a replica rather than the original lock. For example in the code below, both the main thread and the ThreadPool thread would successfully enter the lock. The moral of the story is that ideally these are not passed around, and instead they are used as local variables or member fields:

SpinLock sl = new SpinLock();

ThreadPool.QueueUserWorkItem(state =>

  {

    SpinLock theLock = (SpinLock)state;

    theLock.Enter();

  }, sl);

sl.Enter();
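The copy problem can also be reproduced on a single thread. The sketch below assumes the SpinLock that shipped in .NET 4 (with its Enter and TryEnter overloads taking a ref bool); `CopyDemo` and `BothEntered` are hypothetical names.

```csharp
using System;
using System.Threading;

static class CopyDemo
{
    // Returns true when both the original lock and its copy could be acquired.
    public static bool BothEntered()
    {
        SpinLock original = new SpinLock(false); // false: no thread owner tracking
        SpinLock copy = original;                // struct assignment copies the lock state

        bool tookOriginal = false;
        bool tookCopy = false;

        original.Enter(ref tookOriginal);
        // The copy knows nothing about the acquisition above, so this succeeds too.
        copy.TryEnter(ref tookCopy);

        bool both = tookOriginal && tookCopy;

        if (tookCopy) copy.Exit();
        if (tookOriginal) original.Exit();
        return both;
    }

    static void Main()
    {
        Console.WriteLine(BothEntered()); // prints True
    }
}
```

Two holders at once means no mutual exclusion at all, which is why a SpinLock should live in one field or local and be passed by reference if it has to be passed.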

(Thanks to Stephen Toub, Joe Duffy and Ed Essey for their input and support)

Coordination Data Structures – LazyInit<T>

This is an article in a series of blog entries describing a set of new Coordination Data Structures (CDS) introduced in the June 2008 CTP of the Parallel Extensions for .NET Framework.

LazyInit<T> provides support for several common patterns of lazy initialization. Here we will explore some of those patterns, but first a point or two regarding lazy initialization:

-          Lazy initialization is often used when the initialization process is an expensive activity which is not always required. With this approach, the execution of the initialization procedure is deferred until immediately before it is required.

-          In order to improve the start-up performance of an application, one may decide to defer some of the initialization activities.

The first pattern covered by LazyInit<T> is the optimistic lazy instantiation pattern. In this pattern, multiple instances of T may be created but only one of those instances is published for all threads to access. In pseudo code, the implementation of such a structure may look similar to this:

public class Singleton

{

  static volatile Singleton _instance;

 

  private Singleton()

  {

    // stuff ...

  }

 

  public static Singleton Instance

  {

    get

    {

      if (_instance == null)

      {

        Singleton local = new Singleton();

        if (Interlocked.CompareExchange<Singleton>(

          ref _instance, local, null) != null)

        {

          IDisposable disposable = local as IDisposable;

          if (disposable != null)

          {

            disposable.Dispose();

          }

        }

      }

      return _instance;

    }

  }

}

As you can see, unwanted instances are disposed automatically. This optimistic approach to concurrency removes the need for a full lock, but has the potential of multiple instantiations.

The LazyInit<T> struct offers a similar functionality. Simply create an instance of the LazyInit<T> using its default constructor:

LazyInit<Company> s = new LazyInit<Company>();

// do some stuff here

// ...

Company c = s.Value;

The above example uses reflection to create an instance of Company when the Value property is first accessed. Obviously Company must have a public parameter-less constructor for this to succeed.

It is worth emphasising that the default behaviour of LazyInit does not prevent the instantiation of more than one Company object by multiple threads concurrently accessing the Value property when the value is not yet set. However it guarantees that only one of those instances is published for all threads to access.

Another pattern offered by LazyInit<T> is the pessimistic lazy initialization pattern. This is very similar to the well-known double-checked locking pattern and in pseudo code looks like this:

public class Singleton

{

  static volatile Singleton _instance;

  static object s_lock = new object();

 

  private Singleton()

  {

    // stuff ...

  }

 

  public static Singleton Instance

  {

    get

    {

      // double-check locking pattern

      if (_instance == null)

      {

        lock (s_lock)

        {

          if (_instance == null)

          {

            _instance = new Singleton();

            return _instance;

          }

        }

      }

      return _instance;

    }

  }

}

As you can see, no more than a single instance of Singleton is ever instantiated. This is achieved at the cost of acquiring a lock.

LazyInit<T> also provides similar functionality. Here are some variations of this pattern implemented using LazyInit<T>:

Pessimistic Lazy Instantiation:

LazyInit<Company> s = new LazyInit<Company>(

  () => new Company(), LazyInitMode.EnsureSingleExecution);

// do some stuff here

// ...

Company c = s.Value;

Pessimistic Lazy Initialization:

In the example below, a database connection is opened only when needed. The connection is then kept open and no other connection is created for this instance of DataAccess:

class DataAccess

{

  LazyInit<SqlConnection> cnn;

 

  public DataAccess()

  {

    cnn = new LazyInit<SqlConnection>(

      delegate

      {

        var c = new SqlConnection(

          "Server=.;Database=Northwind;Integrated Security=true");

        c.Open();

        return c;

      }, LazyInitMode.EnsureSingleExecution);

  }

 

  public int GetProductCount()

  {

    // perform a db query: this is when we actually

    // create and open a connection to the database

    using (var cmd = new SqlCommand(

      "select count(*) from products", cnn.Value))

    {

      return (int)cmd.ExecuteScalar();

    }

  }

}
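LazyInit<T> itself is a CTP type. In the released .NET 4, the closest counterpart appears to be Lazy<T>, where LazyThreadSafetyMode.ExecutionAndPublication plays the role of EnsureSingleExecution and PublicationOnly that of AllowMultipleExecution. The sketch below assumes that released API; the `Company` class and the `CountInitializations` helper are invented for the example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class Company
{
}

static class LazyDemo
{
    // Counts how often the factory runs when eight workers race for the value.
    public static int CountInitializations(LazyThreadSafetyMode mode)
    {
        int runs = 0;
        var lazy = new Lazy<Company>(() =>
        {
            Interlocked.Increment(ref runs);
            return new Company();
        }, mode);

        Parallel.For(0, 8, i =>
        {
            Company unused = lazy.Value; // first access triggers the initialization
        });

        return runs;
    }

    static void Main()
    {
        // Pessimistic: the factory runs once, under a lock.
        Console.WriteLine(
            CountInitializations(LazyThreadSafetyMode.ExecutionAndPublication)); // prints 1

        // Optimistic: the factory may run more than once, but one value is published.
        Console.WriteLine(
            CountInitializations(LazyThreadSafetyMode.PublicationOnly));
    }
}
```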

Another pattern offered by LazyInit<T> allows an initialization per thread such that each thread will get its own value. In this pattern, the value from the initialization is stored in the thread local storage (TLS):

LazyInit<Company> l = new LazyInit<Company>(

  () => new Company { ID = Guid.NewGuid() },

  LazyInitMode.ThreadLocal);

 

// l.Value access from the main thread

Console.WriteLine(l.Value.ID);

 

ThreadPool.QueueUserWorkItem(delegate

  {

    // l.Value access from a ThreadPool thread

    Console.WriteLine(l.Value.ID);

  });

In the example above, the main thread and the ThreadPool thread each create a different instance of Company with different IDs:
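The same per-thread behaviour can be checked with ThreadLocal<T>, the type that took over this role in the released .NET 4. This is a sketch under that assumption; `ThreadLocalDemo` and its method are hypothetical names.

```csharp
using System;
using System.Threading;

static class ThreadLocalDemo
{
    // Returns true when a second thread gets its own, different value.
    public static bool ThreadsGetDifferentValues()
    {
        using (var id = new ThreadLocal<Guid>(() => Guid.NewGuid()))
        {
            Guid mainId = id.Value;      // initialized for the calling thread
            Guid workerId = Guid.Empty;

            var worker = new Thread(() => workerId = id.Value);
            worker.Start();
            worker.Join();               // the worker's value is now visible here

            // The calling thread keeps seeing its own value; the worker saw another.
            return mainId == id.Value && mainId != workerId;
        }
    }

    static void Main()
    {
        Console.WriteLine(ThreadsGetDifferentValues()); // prints True
    }
}
```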

 

Now that you have seen some different lazy initialization patterns, it is time to fully introduce the LazyInitMode enum that can be specified as a parameter to the LazyInit<T> constructor:

LazyInitMode value: description

AllowMultipleExecution: The initialization function may be executed multiple times if multiple threads race to initialize the value, but only one value will be published for all threads to access.

EnsureSingleExecution: The initialization function will only be executed once, even if multiple threads race to initialize the value. This value will be published for all threads to access.

ThreadLocal: The initialization function will be invoked once per thread such that each thread will get its own published value.

 

One last point, LazyInit<T> is a value type and does not have the overhead of being a class. However you need to be careful about access patterns. If you accidentally make a copy of the struct, you’ll be copying by value meaning that you will be using a replica rather than the original.

(Thanks to Stephen Toub, Joe Duffy and Ed Essey for their input and support)

How to consume REST services with WCF

As you are probably aware by now, Windows Communication Foundation (WCF) 3.5 introduced a new binding called WebHttpBinding to create and to consume REST based services. If you are new to the WCF Web Programming model then see here for more details.

There have been many articles and blogs on how to host a RESTful service. However there doesn’t seem to be much written work on how to consume these services so I thought to write a few lines on this topic.

The new WebHttpBinding is used to configure endpoints that are exposed through HTTP requests instead of SOAP messages. So you can simply call into a service by using a URI.  The URI usually includes segments that are converted into parameters for the service operation.

So the client of a service of this type requires two abilities: (1) send an HTTP request, and (2) parse the response. The default response message format supported out of the box with the WebHttpBinding is “Plain old XML” (POX). It also supports JSON and raw binary data using the WebMessageEncodingBindingElement.

One way of consuming these services is by manually creating an HTTP request. The following example consumes the ListInteresting operation from Flickr:

WebRequest request = WebRequest.Create("http://api.flickr.com/services/rest/?method=flickr.interestingness.getList&api_key=*&extras=");

WebResponse ws = request.GetResponse();

XmlSerializer s = new XmlSerializer(typeof(PhotoCollection));

PhotoCollection photos = (PhotoCollection)s.Deserialize(ws.GetResponseStream());

The idea is simple:

-          Do the HTTP request and include all the parameters as part of the URI

-          Get the response that is in XML format

-          Either parse it or deserialize it into an object

The above code works but it is not elegant: We are not using the unified programming model offered by WCF and the URL is hacked together using string concatenation. The response is also manually deserialized into an object. With WCF and the WebHttpBinding we can automate most of this.

The first step is to define our service contract:

[ServiceContract]

[XmlSerializerFormat]

public interface IFlickrApi

{

  [OperationContract]

  [WebGet(

      BodyStyle = WebMessageBodyStyle.Bare,

      ResponseFormat = WebMessageFormat.Xml,

      UriTemplate = "?method=flickr.interestingness.getList&api_key={apiKey}&extras={extras}")]

  PhotoCollection ListInteresting(string apiKey, string extras);

}

As you can see, I am specifically instructing WCF to use the XML Serializer Formatter for this. The next step is to set the client endpoint. I decided to do this inside the config file:

<system.serviceModel>

  <client>

    <endpoint address="http://api.flickr.com/services/rest"

              binding="webHttpBinding"

              behaviorConfiguration="flickr"

              contract="FlickrApp.IFlickrApi"

              name="FlickrREST" />

  </client>

 

  <behaviors>

    <endpointBehaviors>

      <behavior name="flickr">

        <webHttp/>

      </behavior>

    </endpointBehaviors>

  </behaviors>

</system.serviceModel>


In order to be able to use the XML Serializer Formatter, I need XML Serializable types:

  [XmlRoot("photos")]

  public class PhotoCollection

  {

    [XmlAttribute("page")]

    public int Page { get; set; }

 

    ...

 

    [XmlElement("photo")]

    public Photo[] Photos { get; set; }

 

  }

 

  public class Photo

  {

    [XmlAttribute("id")]

    public string Id { get; set; }

 

    [XmlAttribute("title")]

    public string Title { get; set; }

 

    ...

  }

The final step is to create an instance of the client proxy:

ChannelFactory<IFlickrApi> factory =

  new ChannelFactory<IFlickrApi>("FlickrREST");

var proxy = factory.CreateChannel();

var response = proxy.ListInteresting("xxxx", "yyyy");

((IDisposable)proxy).Dispose();

If you don’t like using ChannelFactory directly then you can create your proxy by deriving from ClientBase<>:

public partial class FlickrClient :

  ClientBase<IFlickrApi>, IFlickrApi

{

  public FlickrClient()

  {

  }

 

  public FlickrClient(string endpointConfigurationName) :

    base(endpointConfigurationName)

  {

  }

 

  public FlickrClient(

    string endpointConfigurationName,

    string remoteAddress) :

    base(endpointConfigurationName, remoteAddress)

  {

  }

 

  public FlickrClient(string endpointConfigurationName,

    EndpointAddress remoteAddress) :

    base(endpointConfigurationName, remoteAddress)

  {

  }

 

  public FlickrClient(Binding binding,

    EndpointAddress remoteAddress) :

    base(binding, remoteAddress)

  {

  }

 

  public PhotoCollection ListInteresting(string apiKey, string extras)

  {

    return base.Channel.ListInteresting(apiKey, extras);

  }

}

Now the client code will look similar to the following:

FlickrClient proxy = new FlickrClient();

var response = proxy.ListInteresting("xxxxxx","yyyyyy");

((IDisposable)proxy).Dispose();

 

Hope the above helps.

More on self-replicating tasks

Some more stuff to remember when dealing with self-replicating tasks. (See my earlier post for an introduction to Parallel FX and self-replicating tasks):

-          Self-replicating tasks should have an inter-replica communication mechanism for communicating the progress/details of the activity. This depends on what the activity is trying to achieve. See here for an example.

-          Self-replicating tasks should have an inter-replica communication mechanism for communicating the completion of the overall activity.

-          Only use when the cost of this communication and the management of partitions is considerably less than the potential benefit gained from parallelism

-          Do not assume that the task is always replicated. It is only replicated if there are available resources. For the same reason also, do not assume that there will be a specific number of replicas.

-          In some instances, the number of replicas could far exceed the number of cores in the machine.

-          You may choose to use optimistic concurrency when it is possible to correctly deal with multiple executions of the same step.

-          In general, replicating tasks are an advanced feature that can be very useful in specific scenarios. Use with caution.

(All based on the first Parallel Extensions CTP)

WCF error handling and some best practices

I put together the following brief description of WCF Error Handling and some possible best practices for a customer. You may also find it useful:

 

There are 4 sets of errors that clients can expect:

 

Invalid configuration: when bindings, behaviors or any other configs are in conflict with some other settings.

Communication errors: These are the usual errors caused by network communication issues such as incorrect or unreachable addresses and the unavailability of a network connection. You may receive a CommunicationException as a result of this.

Service faults: By default all service side exceptions are sent to the client as FaultException.

Proxy or channel state errors: These types of errors are raised when the channel or the proxy is not in a correct state to allow for communications. For example the proxy could be in the Faulted state and attempting to use that proxy will throw an exception.

 

Why Faults instead of Exceptions?

As you are aware, WCF mainly deals with SOAP Faults instead of Exceptions. Here is a blog entry I wrote highlighting some of the reasons why we use faults instead of exceptions.

In short, a SOAP Fault provides an adequate mapping between service exceptions and their equivalent on the client.

 

Should you throw a CLR exception or should you prefer a FaultException or its derivative FaultException<T>?

As mentioned above, all service side exceptions (non FaultException derived ones) are automatically converted to a FaultException. A FaultException in itself however does not provide much information regarding the problem. It is also not possible to distinguish between different types of exceptions at the client-end as all exceptions are automatically converted to a generic FaultException. Here is an example of a case in which a simple System.Exception was thrown but the client received a FaultException with no further information:

 

An exception of type 'FaultException' was caught...

Message: The server was unable to process the request due to an internal error. For more information about the error, either turn on IncludeExceptionDetailInFaults (either from ServiceBehaviorAttribute or from the <serviceDebug> configuration behavior) on the server in order to send the exception information back to the client, or turn on tracing as per the Microsoft .NET Framework 3.0 SDK documentation and inspect the server trace logs.

 

If a FaultException had been used, some more information regarding the exception could have been sent to the client in the form of a FaultReason. In instances where an unhandled non-FaultException is thrown, one of the following may happen based on the instance management settings of your service:

 

For per-call services:

The service instance is disposed and the proxy throws a FaultException on the client side. All exceptions of this type will fault the channel so that the same proxy can no longer be used. In fact, attempting to reuse or even dispose the proxy may throw the following:

 

An exception of type 'CommunicationObjectFaultedException' was caught...

Message: The communication object, System.ServiceModel.Channels.ServiceChannel, cannot be used for communication because it is in the Faulted state.

 

For sessionful services:

The session is terminated, the service instance is disposed and the channel will be in the faulted state, thus the proxy cannot be reused.

 

For singleton services:

The channel is faulted and the same proxy cannot be used. However the singleton instance will live on.

 

In all cases mentioned above, the channel moves to the Faulted state. This is however not the case when a FaultException or a FaultException<T> is used. Therefore it is strongly recommended to throw a FaultException or one of its derivatives.
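On the client side, a proxy whose channel has faulted should be aborted rather than closed, because Close on a faulted communication object throws. The following is a minimal sketch of a cleanup helper; ProxyCleanup, SafeClose, IPing and the endpoint address are hypothetical names used only for illustration.

```csharp
using System;
using System.ServiceModel;

// Hypothetical contract, used only so that the demo below has something to create.
[ServiceContract]
public interface IPing
{
    [OperationContract]
    void Ping();
}

public static class ProxyCleanup
{
    // Closes a proxy (or any other communication object) without letting
    // a faulted channel turn the cleanup itself into an exception.
    public static void SafeClose(ICommunicationObject proxy)
    {
        if (proxy == null)
            return;

        try
        {
            if (proxy.State == CommunicationState.Faulted)
                proxy.Abort();   // Close() would throw in the Faulted state
            else
                proxy.Close();
        }
        catch (CommunicationException)
        {
            proxy.Abort();
        }
        catch (TimeoutException)
        {
            proxy.Abort();
        }
    }

    // Demo: a factory that was never opened can be cleaned up the same way.
    // The address is an assumption and is never contacted.
    public static CommunicationState Demo()
    {
        ChannelFactory<IPing> factory = new ChannelFactory<IPing>(
            new BasicHttpBinding(),
            new EndpointAddress("http://localhost:8000/ping"));

        SafeClose(factory);
        return factory.State;
    }
}
```

Calling SafeClose from a finally block keeps the cleanup logic in one place instead of repeating it around every proxy call.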

 

FaultException or FaultException<T>?

A FaultException in itself does not allow the service to provide detailed information regarding the exception to the client. In fact an instance of a FaultReason is as much information as you can send to the client. A FaultReason allows for localised versions of a message to be sent to the client.

 

Also, using a FaultException, it would be hard to distinguish between different exceptions at the client-end. In scenarios where you need to distinguish between exceptions and/or provide more information regarding the error, you can use a FaultException<T>. The constructor for this class takes an instance of T. T needs to be a DataContract or at least a serializable type. The instance of T will provide more information regarding the Fault.

 

At this point it is worth mentioning that all FaultException<T> instances are automatically converted to the simpler FaultException type if no FaultContract matching the type T is defined for the operation in question. Therefore when using a FaultException<T>, it is advisable to use a FaultContract. Of course it is possible to have multiple FaultContracts defined for an operation.
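As a rough illustration of the points above, the sketch below declares a fault detail type, attaches a FaultContract to an operation and throws a FaultException<T>. IOrderService, OrderService, SubmitOrder and the members of MyFaultInfo are hypothetical and exist only for this example.

```csharp
using System.Runtime.Serialization;
using System.ServiceModel;

// The detail type T: a data contract that carries extra information to the client.
[DataContract]
public class MyFaultInfo
{
    [DataMember]
    public int ErrorCode { get; set; }

    [DataMember]
    public string Detail { get; set; }
}

[ServiceContract]
public interface IOrderService
{
    // Without this FaultContract the client would only see a plain FaultException.
    [OperationContract]
    [FaultContract(typeof(MyFaultInfo))]
    void SubmitOrder(int orderId);
}

public class OrderService : IOrderService
{
    public void SubmitOrder(int orderId)
    {
        if (orderId <= 0)
        {
            MyFaultInfo info = new MyFaultInfo
            {
                ErrorCode = 400,
                Detail = "orderId must be a positive number"
            };

            // The FaultReason is the human-readable part; the detail is the typed part.
            throw new FaultException<MyFaultInfo>(info, new FaultReason("Invalid order id"));
        }

        // ... normal processing would go here
    }
}
```

A client can then catch FaultException<MyFaultInfo> and read the Detail property to tell this fault apart from any other.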

 

Do you need a FaultContract?

The answer is NO but read on. When throwing a FaultException or one of its derivatives, if no FaultContracts are specified, they are all converted to a simple FaultException. Use FaultContracts however when you are using FaultException<T> and you need the detailed information provided by the instance of T or you need to be able to distinguish between different exception types.

 

BTW, FaultContracts are published as part of the metadata for the service. Therefore T will have a representation at the client-end. In the unlikely scenario that the representation of T on the client has been modified manually and no longer matches T on the service, an instance of FaultException is thrown by the client.

 

How about one-way operations?

It is not possible to specify a FaultContract for one-way operations.

 

For debugging and diagnostics only:

It is possible to enable exception details to be sent to the client. This can be done using the IncludeExceptionDetailInFaults property of the service behaviour. However the use of this option is only recommended for debugging or problem diagnosis scenarios.

 

An example of handling errors at the client:

 

try

{

  proxy.SomeOperation();

}

catch (FaultException<MyFaultInfo> ex)

{

  // only if a fault contract was specified

}

catch (FaultException ex)

{

  // any other faults

}

catch (CommunicationException ex)

{

  // any communication errors?

}

 

An interesting and useful WCF extensibility point for error handling:

WCF is a very extensible framework. You can explicitly control the behaviour of your application when an exception is thrown.

You can

-          Decide to send a fault to the client or not,

-          Replace an exception with a fault,

-          Replace a fault with another fault,

-          Perform logging,

-          Perform other custom activities

 

In order to utilise this extensibility feature, you need to implement the IErrorHandler interface. You will then need to install your custom error handler by adding it to the ErrorHandlers property of the channel dispatchers for your service.  It is possible to have more than one error handler and they are called in the order they are added to this collection.

 

IErrorHandler introduces two very interesting methods:

 

public interface IErrorHandler

{

  bool HandleError(Exception error);

  void ProvideFault(Exception error,

       MessageVersion version, ref Message fault);

}

 

Implement ProvideFault to control the fault message that is sent to the client. This method is called regardless of the type of the exception thrown by an operation in your service. If no operation is performed here, WCF assumes its default behaviour and continues as if there were no custom error handlers in place.

 

Using the ProvideFault method, it is possible to provide a new fault. Here is an example:

 

public void ProvideFault(Exception error,

      MessageVersion version, ref Message fault)

{

  FaultException newEx = new FaultException();

  MessageFault msgFault = newEx.CreateMessageFault();

  fault = Message.CreateMessage(version, msgFault, newEx.Action);

}

 

One area where you could perhaps use this approach is when you want to create a central place for converting exceptions to faults before they are sent to the client (ensuring that the instance is not disposed and the channel is not moved to the Faulted state).

 

The IErrorHandler.HandleError method on the other hand is usually used to implement error-related behaviours such as error logging, system notifications, shutting down the application, and so on. This would be my recommended place to add logging capabilities.


Correction: IErrorHandler.HandleError can be called at multiple places inside the service, and depending on where the error is thrown, the HandleError method may or may not be called by the same thread as the operation. In other words, WCF does not make any guarantees on which thread the HandleError call may be processed.
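Putting the pieces above together, here is a minimal sketch of an error handler that promotes non-fault exceptions to a generic FaultException, together with a service behaviour that installs it on every channel dispatcher. The class names, the fault text and the console logging are assumptions made for illustration; a real handler would plug in its own logging and fault mapping.

```csharp
using System;
using System.Collections.ObjectModel;
using System.ServiceModel;
using System.ServiceModel.Channels;
using System.ServiceModel.Description;
using System.ServiceModel.Dispatcher;

// Hypothetical handler: logs every error and converts non-fault exceptions to faults.
public class PromoteToFaultErrorHandler : IErrorHandler
{
    public bool HandleError(Exception error)
    {
        // May run on a different thread from the operation, so keep this thread-safe.
        Console.Error.WriteLine("Service error: " + error);
        return true; // report the error as handled; adjust to your own policy
    }

    public void ProvideFault(Exception error, MessageVersion version, ref Message fault)
    {
        // Faults thrown deliberately by the service are left untouched.
        if (error is FaultException)
            return;

        FaultException promoted = new FaultException("The request could not be processed.");
        MessageFault messageFault = promoted.CreateMessageFault();
        fault = Message.CreateMessage(version, messageFault, promoted.Action);
    }
}

// Hypothetical behaviour: apply as [PromoteToFaultBehavior] on the service class.
public class PromoteToFaultBehaviorAttribute : Attribute, IServiceBehavior
{
    public void ApplyDispatchBehavior(ServiceDescription serviceDescription,
                                      ServiceHostBase serviceHostBase)
    {
        foreach (ChannelDispatcherBase dispatcherBase in serviceHostBase.ChannelDispatchers)
        {
            ChannelDispatcher dispatcher = dispatcherBase as ChannelDispatcher;
            if (dispatcher != null)
                dispatcher.ErrorHandlers.Add(new PromoteToFaultErrorHandler());
        }
    }

    public void AddBindingParameters(ServiceDescription serviceDescription,
                                     ServiceHostBase serviceHostBase,
                                     Collection<ServiceEndpoint> endpoints,
                                     BindingParameterCollection bindingParameters)
    {
        // Nothing to add.
    }

    public void Validate(ServiceDescription serviceDescription,
                         ServiceHostBase serviceHostBase)
    {
        // Nothing to validate.
    }
}
```

Installing the handler through a behaviour attribute is one option among several; the same handler could equally be added from configuration or from hosting code.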

 

Use of Exception Shielding and the Exception Handling Application Block:

One other recommended approach for dealing with service based exceptions is through the use of the Exception Handling Application Block (EHAB), which is part of the Enterprise Library. In his blog, Guy Burstein describes how you can use EHAB with WCF.

 

Which option should you use?

It all depends on your requirements. Application blocks aim to incorporate commonly used best practices and provide a common approach for exception handling throughout your application. On the other hand, custom error handlers and fault contracts can also be very useful. For instance custom error handlers provide an excellent opportunity to automatically promote all exceptions to FaultExceptions and also to add logging capabilities to your application.

From Exceptions to Faults

A question that I often get asked during workshops on WCF is “Why does WCF use Faults instead of .NET Exceptions?”

Exceptions expose a set of limitations and possible security risks:

-          They are platform/technology specific: .NET exceptions may have no meaning on other platforms

-          Exceptions can cause tight-coupling between clients and service.

o   Clients need to understand exactly what exceptions can be thrown by each operation

-          Flowing exceptions to clients may expose service implementation details

-          Flowing exceptions to clients may expose private and personal information

-          Exceptions and exception hierarchies are not easily represented using metadata

Therefore a map between exceptions on the service and their equivalent on the client is required. This is achieved using SOAP Faults.
It is worth mentioning that exceptions that reach WCF clients are represented as FaultException.

$0.02

Digging deeper into PLINQ’s internal implementation

PLINQ is built on top of the Task Parallel Library (TPL) and promises to revolutionise the way we write programs that can benefit from the multi-core processor era. But how does it work internally?

This article assumes that you are familiar with the basics of LINQ and have an understanding of PLINQ and TPL.

In this short article, I will concentrate on the techniques used by the first CTP of PLINQ to partition work streams and associate different partitions to different threads. It is worth mentioning that by default, TPL provides one thread per processor core.

The deferred execution and lazy evaluation characteristics of LINQ allow for the creation of infinitely long lists as the source for a LINQ query. For instance it is possible to define an enumerable that represents the set of Natural Numbers:

class NaturalNumbersEnumerable : IEnumerable<uint>

{

  public IEnumerator<uint> GetEnumerator()

  {

    uint i = 1;

    while (true)

      yield return i++;

  }

 

  IEnumerator IEnumerable.GetEnumerator()

  {

    return GetEnumerator();

  }

}

(I know that it is not really a full set of natural numbers and it can only go up to uint.MaxValue before either throwing an OverflowException or starting back from 0, but it fits the purpose of this article)

The fact of the matter is that the above list never ends because the more you call MoveNext on the enumerator the more natural numbers are generated. So somehow we need to limit the execution by using functions that limit the size of the returned list. As shown below, Take can be one of those functions:

var linq =

  new NaturalNumbersEnumerable()

    .Take(100);

 

foreach (uint i in linq)

  Console.WriteLine(i);

Executing the above code prints numbers from 1 to 100 in a sequential manner. With PLINQ however, it is possible to execute the same query using more than one thread. The work distribution algorithms used here are the focus of the remainder of this article.

An enumerator (stream) is partitioned into multiple smaller enumerators of type IQueryOperatorEnumerator<T>.  The partitioned stream is represented by PartitionedStream<T> which exposes an array of IQueryOperatorEnumerator<T> that can be iterated concurrently.

One way of building a partitioned stream is through manual creation of partitions, effectively allowing for the introduction of custom partitioning algorithms. However the PartitionedStream<T> derived type responsible for this (ManuallyPartitionedStream<T>) is an internal class at present. If you believe that such extensibility can be useful then let the product team know through here. I personally think that it would be a valuable feature as the users often have a deep understanding of the source stream and can better decide on the size and the content of each partition resulting in an improved experience.

Internally there are at least 2 different algorithms used to create partitioned streams. One algorithm partitions the stream based on a range and the other based on a hashcode.

Range based partitioning is sensitive to the type of the source stream. If the source is an indexable data type (i.e. an array or an implementer of IList<T>) then the partitioning is done based on a unique range given to each partition. Also, the source stream is shared by all partitions and there are no range overlaps between partitions. Because access to the source is not synchronised, the indexer of the source may need to be thread-safe.

Note that range partitioning assumes that the work required to process each element will take roughly the same time and will be largely homogeneous. This might not always be true and a custom partitioning solution might perform better.
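To make the idea of non-overlapping ranges concrete, here is a small sketch that splits an index space into contiguous ranges of near-equal size. It is not PLINQ's internal code; RangePartitioning and Split are hypothetical names, and the real implementation may choose its ranges differently.

```csharp
using System;
using System.Collections.Generic;

public static class RangePartitioning
{
    // Splits the indices [0, count) into 'partitions' contiguous ranges.
    // Each pair holds an inclusive start index (Key) and an exclusive end index (Value).
    public static List<KeyValuePair<int, int>> Split(int count, int partitions)
    {
        if (count < 0)
            throw new ArgumentOutOfRangeException("count");
        if (partitions <= 0)
            throw new ArgumentOutOfRangeException("partitions");

        List<KeyValuePair<int, int>> ranges = new List<KeyValuePair<int, int>>();
        int baseSize = count / partitions;
        int remainder = count % partitions;
        int start = 0;

        for (int p = 0; p < partitions; p++)
        {
            // The first 'remainder' partitions take one extra element each.
            int size = baseSize + (p < remainder ? 1 : 0);
            ranges.Add(new KeyValuePair<int, int>(start, start + size));
            start += size;
        }

        return ranges;
    }
}
```

For 10 elements and 3 partitions this yields the ranges [0, 4), [4, 7) and [7, 10). Each worker indexes only into its own range of the shared IList<T>, which is why no two partitions ever touch the same element.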

However if the source is not an IList<T>, we can no longer assume that we are dealing with a finite set of items. This in itself takes away the ability to index into the source stream. Nonetheless we still need to find a way to partition the source.

As the source can have an infinite range, we can no longer decide on the actual size of each partition. This means that each partition can grow substantially during the execution. As part of the growth phase a subset of elements from the source stream are read and stored locally in memory. When a MoveNext is called, the next item is returned from this internal buffer and if needed a new growth operation is performed synchronously.

Due to the nature of these source streams, requests to move the reader and to read the current value must be synchronised using a synchronization primitive such as a lock. As a result, you can safely assume that MoveNext() and the Current property of the source stream are not called by more than one thread at a time. Hence, in our Natural Number example, we can rely on ‘i++’ instead of Interlocked.Increment of ‘i’:

class NaturalNumbersEnumerable : IEnumerable<uint>

{

  public IEnumerator<uint> GetEnumerator()

  {

    uint i = 1;

    while (true)

    {

      yield return i;

      // MoveNext calls are serialised, so no Interlocked.Increment is needed

      i++;

    }

  }

...

This chunk partitioning with self-expanding partitions has a number of implications:

-          High cost of synchronization

-          Memory cost of caching sections of the source stream per partition

-          Reading data elements from the source stream that could never be used:

In the example below, it is realistic to expect the generation of more than 100 natural numbers even though Take(100) can only consume the first 100 numbers. In fact, on my dual-core processor, it generated around 130 numbers.

var linq =

  new NaturalNumbersEnumerable().AsParallel()

    .Take(100);

 

foreach (uint i in linq)

  Console.Write(i);
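The chunk-based scheme described above can be sketched roughly as follows. This is an illustration of the idea, not the CTP's internal types: ChunkedSource, CreatePartition, ElementsRead and the fixed chunk size are all assumptions, whereas the real partitions grow as they go.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: several partitions pull chunks from one shared enumerator.
public sealed class ChunkedSource<T>
{
    private readonly IEnumerator<T> _source;
    private readonly object _gate = new object();
    private readonly int _chunkSize;

    // Number of elements pulled from the source so far (may exceed what is consumed).
    public int ElementsRead;

    public ChunkedSource(IEnumerable<T> source, int chunkSize)
    {
        _source = source.GetEnumerator();
        _chunkSize = chunkSize;
    }

    // Refills a partition's local buffer. MoveNext and Current are only ever
    // called while holding the lock, so the source sees one caller at a time.
    private int FillChunk(List<T> buffer)
    {
        lock (_gate)
        {
            buffer.Clear();
            while (buffer.Count < _chunkSize && _source.MoveNext())
            {
                buffer.Add(_source.Current);
                ElementsRead++;
            }
            return buffer.Count;
        }
    }

    // Each partition serves items from its own buffer and refills it when empty.
    public IEnumerable<T> CreatePartition()
    {
        List<T> buffer = new List<T>(_chunkSize);
        while (FillChunk(buffer) > 0)
        {
            foreach (T item in buffer)
                yield return item;
        }
    }
}

public static class ChunkDemo
{
    // An endless source, similar in spirit to NaturalNumbersEnumerable.
    public static IEnumerable<int> Naturals()
    {
        int i = 1;
        while (true)
            yield return i++;
    }
}
```

With a chunk size of 4 and two partitions, asking each partition for a single item already pulls eight elements from the source, which is the same over-reading effect observed with Take(100).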

 

Read the next paragraph with a pinch of salt as I cannot find signs of its use by the current CTP!

The hash based partitioning algorithm aims to divide the source stream into smaller enumerators. The content of each partition is decided based on the hashcode provided by or for each source element.  If the source stream is an IList<T> then based on the hashcode value of the items in the list, they are assigned to partitions. If however the source stream is an ICollection<T> or any other IEnumerable<T>, it is first converted into an IList<T> and then divided between partitions. Please note that it would not be possible to convert NaturalNumbersEnumerable above to an IList<T>.

How to cancel a task in Parallel FX?

Task Parallel Library (TPL) allows you to easily cancel tasks. Effectively you need to call the Cancel method on the task in question. Imagine the simple sample below:

Task task1 = Task.Create(Foo, 10000);

static void Foo(object o)

{

  for (int i = 0; i < (int)o; i++)

  {

    // some code here

  }

}

A task is created and Foo is automatically called. At some point if you decide to cancel the task while it is running, you can call the Cancel method on the task as shown below:

Task task = Task.Create(Foo, 10000);

task.Cancel();

When you call cancel, all you are doing is to set a Boolean field to true. You can query its value through the IsCanceled property:

task.IsCanceled

When Cancel is called, if the task has not yet been prepared for execution, it will not run. However if the task has already been prepared and assigned to a thread, it will start executing. The execution is not automatically interrupted and requires manual intervention in order to cancel. Effectively you need to build the cancellation logic into your application:

static void Foo(object o)

{

  for (int i = 0; i < (int)o; i++)

  {

    // Could it have been cancelled?

    ThrowIfCurrentTaskCanceled();

    // some code here

  }

}

 

internal static void ThrowIfCurrentTaskCanceled()

{

  Task current = Task.Current;

  if (current != null && current.IsCanceled)

  {

    throw new TaskCanceledException(current);

  }

}

As you can see, it is possible to access the current Task and check to see if it has already been cancelled. One thing to remember regarding the Boolean field used by the IsCanceled property is that it is a volatile field, meaning that all reads are volatile reads with acquire semantics, which can be expensive.
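If the cost of the volatile read matters in a tight loop, one option is to poll the flag only every few iterations. The sketch below shows the idea with a stand-alone flag rather than the CTP's Task type; CancelFlag, Worker, Run and checkEvery are hypothetical names, and the right polling interval depends entirely on the workload.

```csharp
using System;

// Hypothetical stand-in for a task's cancellation state.
public sealed class CancelFlag
{
    private volatile bool _canceled;

    public void Cancel()
    {
        _canceled = true;
    }

    public bool IsCanceled
    {
        get { return _canceled; }
    }
}

public static class Worker
{
    // Runs up to 'iterations' steps, polling the flag once every 'checkEvery' steps.
    // Returns the number of steps completed before cancellation was observed.
    public static int Run(CancelFlag flag, int iterations, int checkEvery)
    {
        if (checkEvery <= 0)
            throw new ArgumentOutOfRangeException("checkEvery");

        int completed = 0;
        for (int i = 0; i < iterations; i++)
        {
            // The volatile read happens only on every checkEvery-th iteration.
            if (i % checkEvery == 0 && flag.IsCanceled)
                break;

            // some code here
            completed++;
        }
        return completed;
    }
}
```

The trade-off is responsiveness: the larger checkEvery is, the longer the loop may keep running after Cancel has been called.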

On a slightly different note, if you are unsure about the spelling of “Cancelled” then take a look at this post.

Which memory model?

In his blog, Eric Eilebrecht explains why when writing multithreaded applications today we should stick to the weak ECMA memory model instead of CLR’s much stronger memory model.

In principle, I have no issue with using a weaker model than the CLR memory model, but my main concern is: at what cost are we prepared to endorse a weaker model?

Writing correct lock-free or low-lock data structures can be challenging to say the least, even when relying on a strong memory model such as CLR’s. The complexity could increase a few fold when targeting a weaker memory model.

Consciously or, more often, instinctively, we write code based on a set of assumptions (requirements, APIs, memory models and hardware architecture). These assumptions however do change. They are not always documented or easily transferable to other developers.

Imagine the following simplified implementation of a lock-free stack:

public class LockFreeStack<T> where T : class

{

  private LinkedListNode<T> _head;

 

  public void Push(T item)

  {

    var newNode = new LinkedListNode<T>();

    newNode.Item = item;

    do

    {

      newNode.NextNode = _head;

      //_head = newNode;

    } while (!SyncHelper.CompareAndSwap<LinkedListNode<T>>(

      ref _head, newNode.NextNode, newNode));

  }

 

  public T Pop()

  {

    LinkedListNode<T> node;

    do

    {

      node = _head;

 

      if (node == null)

        return default(T);

      //_head = node.NextNode;

    } while (!SyncHelper.CompareAndSwap<LinkedListNode<T>>(

      ref _head, node, node.NextNode));

 

    return node.Item;

  }

}

It makes a number of assumptions. For instance, it assumes that you are not interested in knowing the length of the stack. Adding a Count property to this class would require a massive rethink of its lock-free algorithm (this may add a possible second write to shared memory at the time of Pop and Push). It might even require a full rewrite of the class. Therefore when assumptions change, it is not always trivial to refactor and repurpose the code. The utmost care must be taken to ensure that issues and bugs are avoided.
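The SyncHelper and LinkedListNode<T> types used above are not shown in the post. For readers who want something that compiles on its own, here is a sketch of the same stack written directly against Interlocked.CompareExchange; SimpleLockFreeStack and its private Node class are hypothetical names, and treating the helper as a thin wrapper over CompareExchange is an assumption.

```csharp
using System.Threading;

public class SimpleLockFreeStack<T> where T : class
{
    // Private node type, standing in for the post's LinkedListNode<T>.
    private sealed class Node
    {
        public T Item;
        public Node Next;
    }

    private Node _head;

    public void Push(T item)
    {
        Node node = new Node();
        node.Item = item;

        Node snapshot;
        do
        {
            snapshot = _head;
            node.Next = snapshot;
            // Publish the node only if _head is still the value we read.
        } while (Interlocked.CompareExchange(ref _head, node, snapshot) != snapshot);
    }

    public T Pop()
    {
        Node snapshot;
        do
        {
            snapshot = _head;
            if (snapshot == null)
                return null; // empty stack

            // Unlink the head only if nobody changed it in the meantime.
        } while (Interlocked.CompareExchange(ref _head, snapshot.Next, snapshot) != snapshot);

        return snapshot.Item;
    }
}
```

Like the original, this version has a single shared write per operation. A Count property would add a second shared location that must stay consistent with _head, which is exactly the kind of change that forces the algorithm to be rethought.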

In my experience, it is often simpler to rewrite the code and use as much testing as possible on multiprocessor machines.

The second reason why writing applications for a weaker memory model today is not advisable is that we don’t have a runtime on which to test our applications. Many concurrency issues and bugs are discovered at the time of testing.

CLR 2.0 memory model

Memory is usually a shared resource on multithreaded systems therefore access to it must be regulated and fully specified. This specification is often called a “Memory Model”.

Optimisations performed by compilers and the emergence of multi-core processors are some of the factors testing the agility of today’s memory models. The simplest such model is the Sequential Consistency model in which all reads and writes from all threads are effectively queued and sequentially performed. Although it is simple, it hampers scalability and performance. Read this great MSDN magazine article on memory models.

I don’t want to bore you by revisiting some of the basic concepts of multiprocessor machines. However let me also remind you that processor buffers and caches heavily restrict memory models. Caches effectively can move reads and writes. For instance the value for a memory location present in a processor cache can be a copy of an earlier value for that location in the main memory, and reading that value effectively brings the read earlier in time.

Here I have tried to summarise the memory model rules for CLR 2.0 (.NET 2.0, 3.0 and 3.5) as I know them (Joe also covers the same topic here):

1.       Reads and writes cannot be introduced

2.       Reads cannot move before entering a lock and writes cannot move after exiting a lock

3.       No reads or writes can move before a volatile read or after a volatile write

4.       All writes have the effect of volatile write

5.       Duplicate adjacent reads/writes from/to the same location from the same thread can be reduced to one read/write

6.       Reads cannot move before a write to the same location from the same thread

7.       Reads and writes cannot move after a full-barrier (such as Thread.MemoryBarrier).

So what happened to the following rules?

-          “Writes cannot move past other writes from the same thread” is enforced by rule 4

-          “Data dependence among reads and writes is never violated” is enforced by rules 4 and 6

-          “All writes have release semantics” is enforced by rule 4

-          “Reads or writes from a given thread to a given location cannot pass a write from the same thread to the same location” is enforced by rule 4
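Rule 3 is what makes the familiar "publish with a flag" pattern work. In the sketch below, the write to _data cannot move after the volatile write to _ready, and the read of _data cannot move before the volatile read of _ready. Publisher, Publish and TryRead are hypothetical names chosen for this illustration.

```csharp
public class Publisher
{
    private int _data;
    private volatile bool _ready;

    // Writer thread: store the payload first, then raise the flag.
    public void Publish(int value)
    {
        _data = value;
        _ready = true;   // volatile write: the store to _data stays before it
    }

    // Reader thread: check the flag first, then read the payload.
    public bool TryRead(out int value)
    {
        if (_ready)      // volatile read: the load of _data stays after it
        {
            value = _data;
            return true;
        }

        value = 0;
        return false;
    }
}
```

A reader that sees the flag set is therefore guaranteed to see the published value, without either side taking a lock.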

Now that we know the rules, wouldn’t it be nice to have a tool that could automatically check our .NET code to ensure that any valid reordering still keeps the semantics intact and to suggest possible additions of barriers and volatile operations? Well there is such a tool but as far as I understand, this one heavily relies on the weaker memory model defined by the ECMA CLI spec and it is not complete. Also if you are interested in knowing more about memory model verification, take a look at this.

The other interesting aspect of CLR 2.0 is that it guarantees that: read and write access to properly aligned (default behaviour) memory location no larger than the native Integer (32 bit processor = 4-byte aligned, 64 bit processor = 8-byte aligned) is atomic. This is defined in section 12.6 of the ECMA CLI Spec. Therefore you can be certain that at least 4-byte assignments in CLR 2.0 are atomic.
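The flip side of this guarantee is that a value wider than the native integer, such as a long on a 32-bit processor, is not covered by it. A minimal sketch of one way to handle such a field, using the Interlocked class, is shown below; Counter64 is a hypothetical name used only for this example.

```csharp
using System.Threading;

public class Counter64
{
    private long _value;

    // Replaces the whole 64-bit value in one atomic operation.
    public void Set(long value)
    {
        Interlocked.Exchange(ref _value, value);
    }

    // Reads the whole 64-bit value atomically, even on a 32-bit processor.
    public long Get()
    {
        return Interlocked.Read(ref _value);
    }
}
```

On a 64-bit processor a properly aligned long already falls under the guarantee above, so this mainly matters for code that must also run correctly on 32-bit machines.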

Why do you need to know this? Well, writing good concurrent code, particularly lock-free or low-lock data structures, relies heavily on developers having a deep understanding of the memory model. Also, memory models can change, so if you stick to a weaker memory model today such as the ECMA memory model, it is more likely that your application stays forward-compatible.
