Nicholas Allen's Indigo Blog

Windows Communication Foundation From the Inside

February, 2007

    A Trick with Faults (Discussion)

    The code yesterday was meant to motivate a side-discussion on how faults get generated and handled between the server and client proxy. If you tried running that sample, then you would have seen that despite the FaultException being thrown on the service, the service call completes normally. The return value of the service call is a fault message. If you've been writing your contracts with typed messages instead of the raw Message type, then this is the opposite behavior to what you're used to seeing. Using the same pattern for exception handling doesn't work between typed and untyped messages. This is particularly messy when you have a mix of typed and untyped operation contracts on the same service because it requires some duplicated logic for handling errors. However, I think that would be a pretty rare service design.

    There are four cases that I think are interesting to look at so that you can see the different fault behaviors that could occur.

    Untyped fault exception with an untyped message contract

    The basic case from yesterday is to receive a fault message.

    <s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://www.w3.org/2005/08/addressing">
      <s:Header>
        <a:Action s:mustUnderstand="1">http://www.w3.org/2005/08/addressing/soap/fault</a:Action>
        <a:RelatesTo>urn:uuid:dd129ffe-a8ff-4a70-ad6f-ad48085e94e8</a:RelatesTo>
        <a:To s:mustUnderstand="1">http://www.w3.org/2005/08/addressing/anonymous</a:To>
      </s:Header>
      <s:Body>
        <s:Fault>
          <s:Code>
            <s:Value>s:Sender</s:Value>
          </s:Code>
          <s:Reason>
            <s:Text xml:lang="en-US">boo!</s:Text>
          </s:Reason>
        </s:Fault>
      </s:Body>
    </s:Envelope>

    Typed fault exception with an untyped message contract

    I'm just changing the FaultException to a FaultException<string> here, although you can have any type you want for the fault detail. This changes the contents of the fault message but not the code path. Note that both the action and the detail section change to match the parameterized type.

    <s:Envelope xmlns:s="http://www.w3.org/2003/05/soap-envelope" xmlns:a="http://www.w3.org/2005/08/addressing">
      <s:Header>
        <a:Action s:mustUnderstand="1">http://tempuri.org/IService/VerbStringFault</a:Action>
        <a:RelatesTo>urn:uuid:49ee87c7-691f-48c4-86ea-bb172c99294d</a:RelatesTo>
        <a:To s:mustUnderstand="1">http://www.w3.org/2005/08/addressing/anonymous</a:To>
      </s:Header>
      <s:Body>
        <s:Fault>
          <s:Code>
            <s:Value>s:Sender</s:Value>
          </s:Code>
          <s:Reason>
            <s:Text xml:lang="en-US">The creator of this fault did not specify a Reason.</s:Text>
          </s:Reason>
          <s:Detail>
            <string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">boo!</string>
          </s:Detail>
        </s:Fault>
      </s:Body>
    </s:Envelope>
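
    For the two untyped-contract cases above, the client has to inspect the returned Message itself. The following is a minimal sketch of that check, not code from the original sample. It builds the two fault replies locally rather than calling a service, and the FaultInspector class, the urn:example actions, and the 64 KB buffer limit are all assumptions made for illustration. It also assumes that any fault detail is a string.

```csharp
using System;
using System.ServiceModel;
using System.ServiceModel.Channels;

static class FaultInspector
{
    // Describes a reply returned from a Message-typed operation.
    public static string Describe(Message reply)
    {
        if (!reply.IsFault)
        {
            return "normal reply, action " + reply.Headers.Action;
        }

        // Reading the fault consumes the message body, so do it only once.
        MessageFault fault = MessageFault.CreateFault(reply, 64 * 1024);
        string reason = fault.Reason.GetMatchingTranslation().Text;

        // Assumption: the detail, when present, was serialized from a string.
        string detail = fault.HasDetail ? fault.GetDetail<string>() : "(none)";
        return "fault, reason '" + reason + "', detail " + detail;
    }
}

class Program
{
    static void Main()
    {
        // Build fault replies locally instead of calling a real service.
        MessageVersion version = MessageVersion.Soap12WSAddressing10;
        FaultCode sender = new FaultCode("Sender");

        MessageFault untyped = MessageFault.CreateFault(sender, new FaultReason("boo!"));
        MessageFault typed = MessageFault.CreateFault(sender, new FaultReason("no reason given"), "boo!");

        Console.WriteLine(FaultInspector.Describe(
            Message.CreateMessage(version, untyped, "urn:example:fault")));
        Console.WriteLine(FaultInspector.Describe(
            Message.CreateMessage(version, typed, "urn:example:typed-fault")));
    }
}
```

    If you would rather handle both contract styles with one catch block, FaultException.CreateFault can turn the MessageFault back into an exception that you throw yourself.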

    Untyped fault exception with a typed message contract

    As soon as we lose the untyped message (any return type but Message, even void), then we get an entirely different code path on the client. Instead of a return value, an exception of type FaultException gets thrown from the proxy.

    System.ServiceModel.FaultException: boo!
    Server stack trace:
    at System.ServiceModel.Channels.ServiceChannel.HandleReply(ProxyOperationRuntime operation, ProxyRpc& rpc)
    at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs, TimeSpan timeout)
    at System.ServiceModel.Channels.ServiceChannel.Call(String action, Boolean oneway, ProxyOperationRuntime operation, Object[] ins, Object[] outs)
    at System.ServiceModel.Channels.ServiceChannelProxy.InvokeService(IMethodCallMessage methodCall, ProxyOperationRuntime operation)
    at System.ServiceModel.Channels.ServiceChannelProxy.Invoke(IMessage message)
    Exception rethrown at [0]:
    at System.Runtime.Remoting.Proxies.RealProxy.HandleReturnMessage(IMessage reqMsg, IMessage retMsg)
    at System.Runtime.Remoting.Proxies.RealProxy.PrivateInvoke(MessageData& msgData, Int32 type)
    at IService.Verb(Message input)

    Typed fault exception with a typed message contract

    With typed messages, the fault exception type is now important for error handling. The parameterized server FaultException comes to the client as the same parameterized type (so FaultException<string>), assuming that we've set up the fault contract for the service correctly. This means that we have the same basic code path but may be selecting a different case to apply.

    System.ServiceModel.FaultException`1[System.String]: The creator of this fault did not specify a Reason. (Fault Detail is equal to boo!).
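
    "Selecting a different case" comes down to the order of the catch blocks on the client. Here is a rough sketch of that pattern. The ITypedService contract and the FaultHandling helper are hypothetical names, and the exceptions are thrown locally so that the example runs without a service. FaultException<string> derives from FaultException, so the more specific catch has to come first.

```csharp
using System;
using System.ServiceModel;

// Hypothetical typed contract. The FaultContract attribute is what lets the
// string detail reach the client as FaultException<string>.
[ServiceContract]
interface ITypedService
{
    [OperationContract]
    [FaultContract(typeof(string))]
    void Verb(string input);
}

static class FaultHandling
{
    // Runs a call and reports which catch block handled the outcome.
    public static string Describe(Action call)
    {
        try
        {
            call();
            return "completed";
        }
        catch (FaultException<string> fault)
        {
            // Most specific case first: the typed detail is available here.
            return "typed fault: " + fault.Detail;
        }
        catch (FaultException fault)
        {
            // Everything else that arrived as a fault.
            return "untyped fault: " + fault.Message;
        }
    }
}

class Program
{
    static void Main()
    {
        // Thrown locally to stand in for what the proxy would throw.
        Console.WriteLine(FaultHandling.Describe(() => { throw new FaultException<string>("boo!"); }));
        Console.WriteLine(FaultHandling.Describe(() => { throw new FaultException("boo!"); }));
        Console.WriteLine(FaultHandling.Describe(() => { }));
    }
}
```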

    Next time: Flow of Messages, Part 1

    A Trick with Faults

    What does this code print? It seems like both choices are quite reasonable. I'll have some discussion about this tomorrow.

    using System;
    using System.ServiceModel;
    using System.ServiceModel.Channels;
    using System.Threading;

    [ServiceContract]
    interface IService
    {
        // Untyped contract: the operation takes and returns the raw Message type.
        [OperationContract(Action="foo")]
        Message Verb(Message input);
    }

    class Service : IService
    {
        public Message Verb(Message input)
        {
            throw new FaultException("boo!");
        }
    }

    class Program
    {
        // Signaled once the host is open so that the client doesn't race it.
        static readonly ManualResetEvent hostReady = new ManualResetEvent(false);

        // Hosts the service on a background thread until Enter is pressed.
        static void Service()
        {
            ServiceHost host = new ServiceHost(typeof(Service), new Uri("net.tcp://localhost/"));
            host.AddServiceEndpoint(typeof(IService), new NetTcpBinding(), "");
            host.Open();
            hostReady.Set();
            Console.ReadLine();
        }

        static void Main(string[] args)
        {
            new Thread(new ThreadStart(Service)).Start();
            hostReady.WaitOne();

            Binding binding = new NetTcpBinding();
            ChannelFactory<IService> factory = new ChannelFactory<IService>(binding, "net.tcp://localhost/");
            IService proxy = factory.CreateChannel();
            try
            {
                Message response = proxy.Verb(Message.CreateMessage(binding.MessageVersion, "foo"));
                Console.WriteLine("Received message");
                Console.WriteLine(response.ToString());
            }
            catch (FaultException fault)
            {
                Console.WriteLine("Received fault");
                Console.WriteLine(fault.ToString());
            }
        }
    }

    Next time: A Trick with Faults (Discussion)

    Channels Illustrated

    In the channel development series last week, we looked at the characteristics of channels (protocol channels, transport channels, and why you would write a channel at all). Let's use a specific example to illustrate those points. Although the protocol for reliable messaging is quite complex, the basic intent of the channel can be described quite simply. We'll keep the discussion simpler by only talking about the send side of the server response. You can map this yourself to the client side or to receiving messages, but this example doesn't need a lot of details to get the point across.

    Here's a channel stack that contains the reliable messaging channel.

    When the server sends a message out to the client, that message passes through the upper protocol channels in the channel stack. Those protocol channels can create, alter, or destroy messages along the way, but let's say that the message arrives intact to the reliable messaging channel. The reliable messaging channel first makes a copy of the message just in case delivery fails in the future. The reliable messaging channel then sends the message down through the remaining protocol channels. Again, we'll assume that nothing happens along the way; the message gets to the transport and is sent over the network. Sometimes, despite the best efforts of this process, the client never receives the message. In that case, the reliable messaging channel will produce another copy of the message from its store and send it again. Eventually, the client will either acknowledge receipt of the message or the two sides will decide to give up due to their inability to communicate. When the message is acknowledged, the reliable messaging channel throws its copy away as the message is no longer needed.
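
    The copy, resend, and discard cycle can be sketched in a few lines. This is a toy model under loud assumptions: ReliableSender is a made-up class, messages are plain strings, the rest of the channel stack is reduced to a delegate, and acknowledgements are delivered by calling a method. It says nothing about how the real reliable messaging protocol is implemented.

```csharp
using System;
using System.Collections.Generic;

// Toy model of the send side of a reliable messaging channel.
class ReliableSender
{
    private readonly Action<int, string> sendBelow;   // stands in for the lower channels
    private readonly int maxAttempts;
    private readonly Dictionary<int, string> unacknowledged = new Dictionary<int, string>();
    private readonly Dictionary<int, int> attempts = new Dictionary<int, int>();
    private int nextSequence = 1;

    public ReliableSender(Action<int, string> sendBelow, int maxAttempts)
    {
        this.sendBelow = sendBelow;
        this.maxAttempts = maxAttempts;
    }

    public int PendingCount { get { return unacknowledged.Count; } }

    public int Send(string message)
    {
        int sequence = nextSequence++;
        unacknowledged[sequence] = message;   // keep a copy in case delivery fails
        attempts[sequence] = 1;
        sendBelow(sequence, message);
        return sequence;
    }

    public void Acknowledge(int sequence)
    {
        // The copy is no longer needed once the other side has the message.
        unacknowledged.Remove(sequence);
        attempts.Remove(sequence);
    }

    // Would be driven by a timer in practice; resends every unacknowledged copy.
    public void ResendUnacknowledged()
    {
        foreach (int sequence in new List<int>(unacknowledged.Keys))
        {
            if (attempts[sequence] >= maxAttempts)
            {
                throw new TimeoutException("Giving up on message " + sequence);
            }
            attempts[sequence]++;
            sendBelow(sequence, unacknowledged[sequence]);
        }
    }
}

class Program
{
    static void Main()
    {
        List<string> wire = new List<string>();
        ReliableSender sender = new ReliableSender((seq, msg) => wire.Add(seq + ":" + msg), 3);

        int id = sender.Send("hello");
        sender.ResendUnacknowledged();   // pretend the first copy was lost
        sender.Acknowledge(id);
        sender.ResendUnacknowledged();   // nothing left to resend

        Console.WriteLine(string.Join(", ", wire.ToArray()));   // prints "1:hello, 1:hello"
        Console.WriteLine(sender.PendingCount);                  // prints "0"
    }
}
```

    One input message produced two output messages here, which is the lack of one-to-one correspondence that the criteria below ask about.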

    Let's check against the criteria for writing channels to see if a channel was really needed to perform reliable messaging.

    1. Did we need a new component that interfaces with the network? No, there was some other component in the channel stack that handled the network for us.
    2. Did we need to establish a pattern of message exchange? Yes, the pattern of retries and acknowledgement is something that's different from the exchange inherent in a service call.
    3. Did we need to have a protocol for expressing the messages? Yes, although we didn’t talk about the details of that protocol in this example. We need some way of describing what data is, what acknowledgments are, and when retries are occurring.
    4. Did we need something other than a one-to-one correspondence between input and output? Yes, there could have been any number of output messages due to retries.
    5. Did we need to perform an operation that cannot be represented as a method call? Yes, there's no way to make a method call that would compose with the other protocol channels.

    It looks like channels are the only extensibility point that can perform reliable messaging.

    Next time: A Trick with Faults

    Tuning Contracts for Performance

    I have a service contract with a few operations that take large inputs and do a lot of processing. If I configure the service quotas with small values to prevent too many of the expensive operations from happening at once, then the overall throughput is very bad. If I configure the service quotas with large values, then the expensive operations could be called many times and the server will run out of resources.

    I've left the question off of this one because I'm not actually going to talk about this problem. Instead, the description was a good excuse to talk about the design of service contracts.

    Long-time programmers will recognize the discussion on chatty versus chunky interface design. Chatty interfaces break operations up into small units of work. It takes a lot of chatty method calls to get something done, but because everything is nicely componentized, it's possible to reassemble the pieces in ways that the original designer didn't think of. Chunky interfaces map large, user-scale operations to a single method call. It only takes a few chunky method calls to get something done, but the operations chosen for the interface are really all you can do.
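
    To make the distinction concrete, here is one way the same task might look as a chatty contract and as a chunky contract. The order-entry scenario and every name in it (IChattyOrders, IChunkyOrders, OrderRequest, OrderLine, RoundTrips) are invented for illustration and do not come from any real service.

```csharp
using System;
using System.Runtime.Serialization;
using System.ServiceModel;

// Chatty: each small step is its own operation, and so its own call.
[ServiceContract]
interface IChattyOrders
{
    [OperationContract] int CreateOrder();
    [OperationContract] void AddLine(int orderId, string product, int quantity);
    [OperationContract] void SetShippingAddress(int orderId, string address);
    [OperationContract] void Submit(int orderId);
}

[DataContract]
class OrderLine
{
    [DataMember] public string Product;
    [DataMember] public int Quantity;
}

[DataContract]
class OrderRequest
{
    [DataMember] public string ShippingAddress;
    [DataMember] public OrderLine[] Lines;
}

// Chunky: the whole user-scale operation travels in one call.
[ServiceContract]
interface IChunkyOrders
{
    [OperationContract] int SubmitOrder(OrderRequest request);
}

static class RoundTrips
{
    // Create + one AddLine per line + SetShippingAddress + Submit.
    public static int Chatty(int lineCount) { return 1 + lineCount + 1 + 1; }

    // One call regardless of how many lines the order has.
    public static int Chunky(int lineCount) { return 1; }
}

class Program
{
    static void Main()
    {
        Console.WriteLine("10 lines: chatty = " + RoundTrips.Chatty(10)
            + " calls, chunky = " + RoundTrips.Chunky(10) + " call");
    }
}
```

    The chatty contract lets a client reorder or skip steps. The chunky contract fixes the shape of the operation but pays the call cost only once.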

    In the non-distributed application world, it often doesn't matter whether your interfaces are chatty or chunky. There were some exceptions. For example, operating system calls have a kernel-mode transition cost so it's beneficial to get a lot of work done with a single call. Every iteration between kernel and user mode results in paying the transition cost. For distributed applications, there is a similar large transition cost called the network. Each trip back to the network introduces a lot of latency relative to the typical computational costs of an operation. COM programming really threw the distinction in your face because it was suddenly a lot harder to predict whether a given method call would incur an expensive transition. You had to program defensively and use lots of chunky interfaces to avoid unpredictable costs at runtime.

    Typically, calls between application tiers have to be done through chunky interfaces. Calls within an application tier can afford to be chattier. HTTP applications in particular tend to be extremely chunky. Often, requesting a resource will return every single piece of information about that resource and you don't have to make any trips to the server again. This trend has swung back around recently but mostly because the resources are too large to transmit at once (maps of the world, my mail box).

    Back to the original topic, the situation sounds a lot like an interface that has mixed together chatty and chunky operations. It has operations with greatly differing granularity. Such interfaces usually end up with the worst of both worlds in terms of performance and usability. The performance characteristics of the interface are going to be bad unless the client and service are located nearby. However, the interface isn't giving you the flexibility you'd have for a typical local service. This scenario would be a good candidate to look at whether someone has thrown multiple service contracts behind one interface without thinking about the overall design.

    Next time: Channels Illustrated

    Transport Channels

    Let's shift gears for a bit and talk about transport channels now as opposed to protocol channels. Everything that was said yesterday for channel stacks is still true when we add transport channels to the picture. Everything that was said yesterday for protocol channels is essentially going to be reversed.

    Here are the essential points that change when we move from protocol channels to transport channels.

    Protocol channels have two sides: an inner and an outer channel.
    Transport channels have one side. The inner end of the transport channel connects to the network.

    Obvious corollary:

    Protocol channels move messages up and down the channel stack.
    Transport channels move messages to and from the network.

    More obvious corollary:

    Every channel in the channel stack but the bottommost one is a protocol channel.
    The bottommost channel in the channel stack is a transport channel.

    Protocol channels work on messages that are XML InfoSets.
    Transport channels operate on a stream of bytes.

    As we go through the architecture, you'll notice that transport channels are the only place during messaging where we have actual bytes. This is a necessity as traditional networking interfaces, such as sockets, have no idea how to send an XML message. The only data type that these interfaces reason about is byte arrays. The transport channel is the component that is responsible for converting between XML messages and byte streams. There is a standard pattern for delegating this behavior from transports to an external component called a message encoder. A message encoder is an interface that allows third-party developers to plug conversion functionality into a transport. You can choose to either use or ignore this model, although I'd strongly suggest supporting encoders if your transport permits multiple byte encodings for an XML message.
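
    As a small illustration of the conversion that an encoder performs, the sketch below uses the text encoder that ships with WCF to turn a message into bytes and back. The EncoderDemo class, the urn:example:greet action, and the 64 KB header limit are assumptions for the example. A real transport would hand the bytes to the network instead of a MemoryStream.

```csharp
using System;
using System.IO;
using System.ServiceModel.Channels;

static class EncoderDemo
{
    // The standard text encoder, obtained through its binding element.
    public static readonly MessageEncoder Encoder =
        new TextMessageEncodingBindingElement().CreateMessageEncoderFactory().Encoder;

    // XML message to bytes: what a transport does on the way out.
    public static byte[] Encode(Message message)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            Encoder.WriteMessage(message, stream);
            return stream.ToArray();
        }
    }

    // Bytes to XML message: what a transport does on the way in.
    public static Message Decode(byte[] wireBytes)
    {
        return Encoder.ReadMessage(new MemoryStream(wireBytes), 64 * 1024);
    }
}

class Program
{
    static void Main()
    {
        Message outgoing = Message.CreateMessage(
            EncoderDemo.Encoder.MessageVersion, "urn:example:greet", "hello");

        byte[] wireBytes = EncoderDemo.Encode(outgoing);
        Console.WriteLine(wireBytes.Length + " bytes, content type " + EncoderDemo.Encoder.ContentType);

        Message incoming = EncoderDemo.Decode(wireBytes);
        Console.WriteLine(incoming.Headers.Action + " -> " + incoming.GetBody<string>());
    }
}
```

    Swapping in BinaryMessageEncodingBindingElement would change the bytes but not the message that comes out the other end, which is the point of keeping the encoder separate from the transport.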

    The transport channel is also responsible for doing any post-processing on the byte stream. For example, there is a common step called framing in which the payload data is embedded inside a structured format for sending messages. Unlike message encoders, there is no standard interface for applying framing to a byte stream. The current transports that support framing build their framing protocol directly into the transport channel.
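
    To show what framing means in general, here is a sketch of the simplest scheme I can think of: a four-byte big-endian length followed by the payload. This is an invented format for illustration only. It is not the framing protocol that the WCF TCP or named pipe transports use, and the LengthPrefixFraming class is a hypothetical name.

```csharp
using System;
using System.IO;
using System.Text;

static class LengthPrefixFraming
{
    // Writes a four-byte big-endian length followed by the payload.
    public static void WriteFrame(Stream stream, byte[] payload)
    {
        int length = payload.Length;
        stream.WriteByte((byte)(length >> 24));
        stream.WriteByte((byte)(length >> 16));
        stream.WriteByte((byte)(length >> 8));
        stream.WriteByte((byte)length);
        stream.Write(payload, 0, length);
    }

    // Reads one frame back, using the length prefix to find where it ends.
    public static byte[] ReadFrame(Stream stream)
    {
        byte[] header = ReadExactly(stream, 4);
        int length = (header[0] << 24) | (header[1] << 16) | (header[2] << 8) | header[3];
        if (length < 0)
        {
            throw new InvalidDataException("Frame length is negative.");
        }
        return ReadExactly(stream, length);
    }

    // Stream.Read may return fewer bytes than requested, so loop until done.
    private static byte[] ReadExactly(Stream stream, int count)
    {
        byte[] buffer = new byte[count];
        int offset = 0;
        while (offset < count)
        {
            int read = stream.Read(buffer, offset, count - offset);
            if (read == 0)
            {
                throw new EndOfStreamException("Stream ended in the middle of a frame.");
            }
            offset += read;
        }
        return buffer;
    }
}

class Program
{
    static void Main()
    {
        MemoryStream wire = new MemoryStream();
        LengthPrefixFraming.WriteFrame(wire, Encoding.UTF8.GetBytes("first"));
        LengthPrefixFraming.WriteFrame(wire, Encoding.UTF8.GetBytes("second"));

        // The receiver sees one undivided byte stream and recovers both payloads.
        wire.Position = 0;
        Console.WriteLine(Encoding.UTF8.GetString(LengthPrefixFraming.ReadFrame(wire)));   // prints "first"
        Console.WriteLine(Encoding.UTF8.GetString(LengthPrefixFraming.ReadFrame(wire)));   // prints "second"
    }
}
```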

    We're going to take a break from the channel development series for a day tomorrow. When we come back, there will be some catch-up for illustrations and then we go directly into the second topic, an overview of the WCF architecture.

    Next time: Tuning Contracts for Performance

    Protocol Channels

    There are only two kinds of channels in the world. Today we'll talk about protocol channels. Tomorrow we'll talk about transport channels.

    1. Transport channels move data to and from the network
    2. Protocol channels move data between the application and transport channel

    The name "protocol channel" for the second group is a lie. Of the channels we shipped in the product, the most common representative for the second group happens to be protocols. That doesn't mean that every channel in the second group has to implement a protocol though. The key distinction for the second group is really "moves data between the application and transport channel". In other words, the second group is made up of everything that is not a transport channel. However, if you're following the advice from yesterday about when to write a channel, then I think you'll find that most of the channels that you write are either transport channels or implement some kind of protocol.

    A channel has two ends to it. The outer end points to the application. The inner end points to the network. If you start stitching together channels end-to-end, you come out with something called the channel stack. The inner end of one channel links to the outer end of another channel. In relative terms, these are the inner and outer channels of the channel that we're looking at.

    Obviously, we run into a problem if we keep following down the path of inner (or outer) channels. If we follow down the path of inner channels, eventually we'll hit a channel that has no other channel on the inner end to pass data to. That innermost channel is the transport channel that we'll talk about tomorrow. If we follow up the path of outer channels, eventually we'll hit a channel that has no other channel on the outer end to pass data to. Beyond the outermost channel is the application. In between the outermost channel and user code is a large mass called the service model. We'll have a brief look at the architecture of the service model in a few days. Otherwise, there's nothing particularly distinctive about the outermost channel. Every channel along the way, except the innermost channel, is a protocol channel.

    The job of a protocol channel is to pump data between its inner and outer sides. That data is a structure called the XML InfoSet. An infoset is a standardized concept but really you don't have to understand the standard to make use of it. The pump that acts on these infosets is any arbitrary code you care to write. As a channel author, you have total control over what happens to the infosets you receive and how you come up with the infosets you send. There is a very rigid contract for how channels plug together but no contract for what a channel does with messages.
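
    The shape of a channel stack can be modeled in a few lines, as long as you accept some heavy simplification. In the sketch below, messages are plain strings rather than infosets, ISimpleChannel is a made-up interface with only a send direction, and none of this is the real WCF channel contract. The rigid part is the interface that links each channel to its inner channel; the pump is whatever delegate you hand in.

```csharp
using System;
using System.Collections.Generic;

// Made-up, send-only stand-in for the contract between channels.
interface ISimpleChannel
{
    void Send(string message);
}

// Innermost channel: has no inner channel, only the "network".
class RecordingTransport : ISimpleChannel
{
    public readonly List<string> Wire = new List<string>();

    public void Send(string message)
    {
        Wire.Add(message);
    }
}

// Protocol channel: arbitrary code between its outer side and its inner channel.
class ProtocolChannel : ISimpleChannel
{
    private readonly ISimpleChannel inner;
    private readonly Func<string, IEnumerable<string>> pump;

    public ProtocolChannel(ISimpleChannel inner, Func<string, IEnumerable<string>> pump)
    {
        this.inner = inner;
        this.pump = pump;
    }

    public void Send(string message)
    {
        // The pump may produce zero, one, or many messages for each input.
        foreach (string output in pump(message))
        {
            inner.Send(output);
        }
    }
}

class Program
{
    static void Main()
    {
        RecordingTransport transport = new RecordingTransport();

        // Stack, outermost first: filter -> stamper -> transport.
        ISimpleChannel stamper = new ProtocolChannel(transport, m => new[] { "[stamped]" + m });
        ISimpleChannel filter = new ProtocolChannel(stamper, m => m.Length == 0 ? new string[0] : new[] { m });

        filter.Send("hello");
        filter.Send("");   // destroyed by the filter, never reaches the transport

        Console.WriteLine(string.Join(", ", transport.Wire.ToArray()));   // prints "[stamped]hello"
    }
}
```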

    Next time: Transport Channels

    When to Write a Channel

    Today's article is about the tension between two simple points.

    • Writing channels can generally be used to solve just about any problem in WCF
    • Writing channels is generally the most time-consuming way to solve a problem in WCF

    The key inference that you should be taking away from this discussion follows pretty directly from these points.

    Just because you can go out and solve your problem by writing a channel doesn't mean that writing a channel is the right answer to your problem.

    How do you know if there's a more cost-effective solution to your problem than writing a channel? Well, that's always going to be a difficult question to answer because there are many very subtle points that cause solutions to explode in complexity. In contrast, channels have a very regular and ordered structure that makes it easy to reason about systems composed from channels (we'll be looking at this structure a lot more in the future). We can exploit the regularity of the channel model by identifying types of problems that work very naturally with channels and by identifying other types of problems that most likely have a better solution elsewhere. Think of this as a pocket-guide to helping you make the decision.

    1. Channels are the component in WCF that interfaces with the network. If you have to take some direct action on a network resource, either sending or receiving data, then you are typically talking about building or using a channel.
    2. Channels are the component in WCF that establishes a pattern of message exchange. When you're having a conversation, there are social rules that dictate when someone can start talking. Different social situations have different sets of rules. In WCF messaging, channels provide the equivalent of these rules of exchange.
    3. Similarly, channels represent the social protocols for how to format ideas for exchange but say nothing about what ideas can be exchanged.
    4. In the most basic messaging pipeline, there is a one-to-one correspondence between messages entering and leaving the pipeline. Many of the components of WCF follow this simple pipeline model but channels do not. Channels take an arbitrary number of input messages (even zero) and produce an arbitrary number of output messages (again, even zero).
    5. Just because you have a message-processing system doesn't mean that you can't make method calls. For example, if all you want to do is execute some SQL query from within your service, then use ADO.NET. If you have built some predefined send and receive operations that just happen to have been implemented against a database, that's when you write a channel.

    Next time: Protocol Channels

    Channel Development Tour, Part 1

    This is the start of a long series on channel development. Some of the material in the series is going to duplicate topics that I've written about in the past. That's ok. The goal of the series is to have a walkthrough that is self-contained and in one place that is easy to read through. Many of those older articles are for older versions of WCF and may differ slightly from what was shipped in the final version. Everything in this series is going to talk about the V1 version of WCF. As an added bonus, everything here should still be true in the next version of WCF and future versions after that. This is the advantage of having to live with backwards compatibility. Future versions of WCF might make it easier to do the things that I talk about, but the methods in this series should continue to work forever.

    In the end, it should be possible to stitch the articles in this series together into one massive blob of text although I probably won't go that far.

    Here's what the series is going to cover:

    1. Background on the role of channels
    2. WCF and the channel model architecture
    3. Basic walkthrough of writing channels
    4. Writing a simple protocol channel
    5. Advanced walkthrough of writing channels
    6. Writing a simple transport channel
    7. Specialty topics for writing channels

    The earlier topics consist of 4 or 5 articles each. The later topics consist of around 10 articles each. As you can see, this is going to be a really long series in total. To counter that, I'm not going to run articles in this series all 5 days a week. You will probably get 3 articles of this series per week mixed in with other unconnected topics.

    Here are the four articles in topic #1:

    1. The Introduction (that's this article you're reading now)
    2. When to Write a Channel
    3. Protocol Channels
    4. Transport Channels

    I'm not going to do an introduction article for each topic so topic #2 starts directly in the fifth article.

    Next time: When to Write a Channel

    Table of Contents Scratch Work

    I haven't forgotten about the goal to put together a table of contents for all of these articles. The part I find hardest about this process is taking the articles that talk about five or six topics and figuring out a single place where they should go. I want to avoid duplicating and cross-linking articles at this level. Here's an example of the products I'm coming up with along the way. I've populated this list with just the articles tagged as involving "messages". That's about 10% of the total number of articles. It takes a long time to get a working organization and design.

    1. Messaging
      1. Messages
        1. Application messages
          1. Reducing Memory Usage with Large Messages
          2. Mixing Message Contract Attributes
          3. Splitting Up XML Text Nodes
          4. Correlating Message Identifiers
          5. You Must Understand This
          6. This message cannot support the operation because it has been copied
          7. Maximum Size of a SOAP Message
        2. Formats
          1. Versioning for Addresses, Envelopes, and Messages
          2. Manual Addressing
          3. Mixed Mode Addressing
          4. The Mixed Mode Addressing Picture
        3. Serialization
          1. Using XML Serialization with WCF
        4. Encodings
          1. Text
          2. Binary
          3. Extensibility
        5. Extensibility
          1. Get the Message
          2. Introducing MessageState
      2. Message protocols
        1. Reliability
        2. Security
          1. Securing Custom Headers, Version 1
          2. Securing Custom Headers, Version 2
        3. Extensibility
      3. Network transports
        1. HTTP
          1. Making One-Way HTTP Requests, Part 2
          2. Making One-Way HTTP Requests, Part 3
          3. Status codes
            1. Modifying HTTP Error Codes, Part 1
            2. Modifying HTTP Error Codes, Part 2
            3. Faults and HTTP
        2. TCP
        3. Named pipes
        4. Queues
        5. Extensibility
      4. Delivery failure
        1. Basics of Failure
        2. Fault messages
          1. Actions for FaultExceptions
          2. A Historical, Awkwardly Named Fault
          3. The Most Distinguished Fault
        3. Sending faults
          1. Designing New Faults
          2. Creating Faults, Part 1
          3. Creating Faults, Part 2
          4. Creating Faults, Part 3
        4. Receiving faults
          1. Zen Faults
          2. Consuming Faults, Part 1
          3. Consuming Faults, Part 2

    Next time: Channel Development Tour, Part 1

    Stashing Data in Extensible Objects

    How do I store some state about the current request so that I can use it later during the same service operation?

    There are several different standard contexts in which state can be stored. Each of them works the same so I'll present all of them together today.

    • State that has the same lifetime as the service host lives inside the service host.
    • State that has the same lifetime as the current service instance lives inside the InstanceContext.
    • State that has the same lifetime as the current service operation lives inside the OperationContext.

    The mechanism for storing and retrieving this state is IExtensionCollection<T>. There is an IExtensionCollection<ServiceHostBase>, an IExtensionCollection<InstanceContext>, and an IExtensionCollection<OperationContext>. These three collection classes are on the corresponding service classes as a member called Extensions. For each IExtensionCollection<T>, you can define a new class that implements IExtension<T> and contains whatever state data you want. You drop a state object into the Extensions collection when you have something to record. When you want to retrieve the state, you search the Extensions collection for an instance whose type matches the one you created. Let's look at an example.

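    // State object that shares the lifetime of the service host.
    class MyServiceState : IExtension<ServiceHostBase>
    {
        public int ServiceCallCounter = 0;

        // Called when this object is added to the host's Extensions collection.
        public void Attach(ServiceHostBase owner)
        {
        }

        // Called when this object is removed from the host's Extensions collection.
        public void Detach(ServiceHostBase owner)
        {
        }
    }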

    Attach and Detach are called when the object is added to or removed from the collection. The timing isn't always consistent here. Detach is called before removal when removing a single item but after removal when removing all items. Attach is always called before the item is added to the collection.

    I can then add my state to the service host at some point after it's created.

    host.Extensions.Add(new MyServiceState());

    Now, inside one of my service operations, I can fish out this state and use it in some way.

    OperationContext.Current.Host.Extensions.Find<MyServiceState>().ServiceCallCounter++;
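
    That one-liner assumes that the state object has been added and that no two operations update the counter at the same moment. Here is a slightly more defensive sketch under those two concerns. The CallCounting helper, the ICounterService contract, and the choice to return -1 when the state is missing are all assumptions of mine. The host in Main is constructed but never opened, since the Extensions collection doesn't need a running service.

```csharp
using System;
using System.ServiceModel;
using System.Threading;

[ServiceContract]
interface ICounterService
{
    [OperationContract]
    void DoWork();
}

class CounterService : ICounterService
{
    public void DoWork()
    {
        // Inside an operation, the host is reachable from the current context.
        CallCounting.Record(OperationContext.Current.Host);
    }
}

class MyServiceState : IExtension<ServiceHostBase>
{
    public int ServiceCallCounter = 0;

    public void Attach(ServiceHostBase owner) { }

    public void Detach(ServiceHostBase owner) { }
}

static class CallCounting
{
    // Returns the new count, or -1 if the state object was never added.
    public static int Record(ServiceHostBase host)
    {
        MyServiceState state = host.Extensions.Find<MyServiceState>();
        if (state == null)
        {
            return -1;
        }

        // Host-level state is shared, so operations may increment concurrently.
        return Interlocked.Increment(ref state.ServiceCallCounter);
    }
}

class Program
{
    static void Main()
    {
        ServiceHost host = new ServiceHost(typeof(CounterService));

        Console.WriteLine(CallCounting.Record(host));   // state not added yet
        host.Extensions.Add(new MyServiceState());
        Console.WriteLine(CallCounting.Record(host));
        Console.WriteLine(CallCounting.Record(host));
    }
}
```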

    If I wanted the state to have a different lifetime, then I would change the IExtension type and attach it to a different context object.

    Next time: Table of Contents Scratch Work

    Jobs, Jobs, Jobs

    How do I find out about jobs working on WCF?

    That's an excellent question. I went to the Microsoft career site and I had a hell of a time finding the jobs that I knew we had available. I ended up finding them by reverse engineering the positions I knew were open until I got search queries that found those positions. In the end, I still couldn't find all the jobs, but I did get to a query that found a pretty good number. This query also has very few false positives, although there are a few jobs that have nothing to do with WCF or future products. You should hopefully be able to spot the fakes immediately upon reading the first paragraph of the job description.

    Here's the query.

    1. Go to the job search page: http://members.microsoft.com/careers/search/default.aspx.
    2. Leave Job Title at "All".
    3. Pick Location "WA - Redmond".
    4. Pick Job Categories "Program Management", "Software Development", and "Software Testing".
    5. What Product do you pick? Is Windows Communications the same as WCF? No. Is it part of the .NET Framework? No. All of the WCF jobs that I could find were filed as "(Not Product Specific)".
    6. In the Keywords field, put "wcf csd". It should work either with or without the quotes.

    That search should give you about 50 total jobs to look at.

    More Poison Message Handling

    We saw the poison message handling strategies for MSMQ 3 and MSMQ 4 yesterday, but how many different strategies can we come up with? Let us count the ways. I've roughly ordered these by increasing complexity.

    Discard. We could simply throw away any message that encounters a processing failure as soon as the failure occurs. This strategy completely solves the poison message retry problem. It is somewhat lacking in terms of practical utility for working with a queue. Some of the messages are not processed, but all of those that are processed are processed in order.

    Return to front. This is the default strategy that we were looking at for a queue before getting into MSMQ. Return to front has the problem that either a queue administrator regularly cleans out poison messages or we block forever trying to process a single bad message. All of the messages that we reach are processed in order. There's no guarantee of termination and no guarantee that a particular message will ever get processed.

    Return to back. We can flip the previous strategy around by moving messages to the end of the queue instead of putting them back on the top. This strategy is a bit trickier to implement because return to front is the natural behavior when rolling back a transaction. Return to back cannot rely on that behavior. All of the messages are processed but not in order. We'll eventually complete every non-poison message although there's still no guarantee of termination.

    Move to other queue. This is the MSMQ 3 strategy. We use a heuristic to decide when to give up on a message and stop processing it. All messages are either processed or saved for later. We'll eventually terminate but now somebody has to clean out the other queue.

    Shuttle between queues. This is the MSMQ 4 strategy. If we allow an infinite number of cycles, then this is equivalent to return to back. If we only allow a finite number of cycles, then this is equivalent to move to other queue. The advantage of this strategy is that we tend to spend a smaller proportion of time attempting poison messages.
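
    Two of these strategies are easy to compare side by side with an in-memory queue. The sketch below is only a simulation: the PoisonStrategies class is hypothetical, messages are strings, and a delegate that returns false stands in for a processing failure. The return to back version takes a cap on the number of passes purely so that the demonstration terminates, which the real strategy does not guarantee.

```csharp
using System;
using System.Collections.Generic;

static class PoisonStrategies
{
    // Move to other queue: retry at the head, then give up after maxFailures.
    public static List<string> MoveToOtherQueue(
        Queue<string> main, Func<string, bool> process, int maxFailures, List<string> otherQueue)
    {
        List<string> processed = new List<string>();
        while (main.Count > 0)
        {
            string message = main.Dequeue();
            int failures = 0;
            bool done = false;
            while (!done && failures < maxFailures)
            {
                done = process(message);
                if (!done)
                {
                    failures++;
                }
            }

            if (done)
            {
                processed.Add(message);
            }
            else
            {
                otherQueue.Add(message);   // saved for somebody to clean out later
            }
        }
        return processed;
    }

    // Return to back: a failed message goes to the end of the same queue.
    public static List<string> ReturnToBack(
        Queue<string> main, Func<string, bool> process, int maxPasses)
    {
        List<string> processed = new List<string>();
        for (int pass = 0; pass < maxPasses && main.Count > 0; pass++)
        {
            int count = main.Count;
            for (int i = 0; i < count; i++)
            {
                string message = main.Dequeue();
                if (process(message))
                {
                    processed.Add(message);
                }
                else
                {
                    main.Enqueue(message);
                }
            }
        }
        return processed;
    }
}

class Program
{
    static void Main()
    {
        Func<string, bool> process = m => m != "poison";

        List<string> otherQueue = new List<string>();
        Queue<string> first = new Queue<string>(new[] { "a", "poison", "b" });
        List<string> moved = PoisonStrategies.MoveToOtherQueue(first, process, 3, otherQueue);
        Console.WriteLine(string.Join(",", moved.ToArray()) + " | other queue: " + otherQueue.Count);   // prints "a,b | other queue: 1"

        Queue<string> second = new Queue<string>(new[] { "a", "poison", "b" });
        List<string> returned = PoisonStrategies.ReturnToBack(second, process, 3);
        Console.WriteLine(string.Join(",", returned.ToArray()) + " | still queued: " + second.Count);   // prints "a,b | still queued: 1"
    }
}
```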

    There are many other variations that we could start making on these patterns. For instance, we can vary the heuristic from number of times processed to time spent processing or some application mechanism. We can change move to other queue to move to resource, regardless of whether that resource is a queue, database, web service, workflow, or other application. I'm not trying to limit you into thinking that there are only 20 or 30 ways that you can handle a poison message.

    Next time: Stashing Data in Extensible Objects

    MSMQ and Poison Messages

    Last time we looked at the idea of poison messages in queues: messages that are permanently unprocessable. If we don't handle a poison message carefully, then we will be locked into a permanent cycle of requesting the message from the queue, failing to process the message, and returning the message to the queue. There are many different strategies that can be applied to reduce or eliminate the resource waste of these futile cycles.

    I'll first look at the primary strategy that MSMQ uses for poison messages. Tomorrow, we'll finish off the series by looking at a comparison of this approach with various other strategies.

    First, how do we detect that a message is permanently unprocessable? The application may be able to tell the queue this, but we can't guarantee that the application will recognize every potential poison message. We can approximate the idea of being permanently unprocessable by saying that there is a threshold for number of failures beyond which we expect this failure pattern to continue indefinitely. On the MSMQ binding, that threshold is controlled by the ReceiveRetryCount.

    Now, assume we've got a poison message. The most common way of dealing with a poison message is to move it out of the main queue and into some other queue where no processing will take place. That other queue is commonly called a dead-letter queue. Up to version 3 of MSMQ, the dead-letter queue is the end of the line. A downside of the dead-letter queue is that some administrator has to flush the messages out in case they can be corrected or in case we have accidentally classified a temporary processing failure as a permanent processing failure.

    Version 4 of MSMQ introduces an optional intermediary step between the main queue and the dead-letter queue. Remember that while we were spinning doing retries on the poison message, other messages were not getting processed. The addition to the strategy is to have a retry queue. Like the dead-letter queue, the retry queue does not process messages. However, after a period of time, set by the RetryCycleDelay, the message gets moved from the retry queue back into the main queue. This process can repeat a number of times, up to the MaxRetryCycles. The retry queue allows us to attempt processing many more times without impeding the progress of the service as much.
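
    As a rough illustration of how the retry count and the retry cycles multiply, the sketch below follows a single message that always fails. The function name is hypothetical, and the counting convention (one initial attempt plus ReceiveRetryCount immediate retries on each visit to the main queue) is an assumption for this model rather than something stated above. The delay itself is not simulated.

```python
def simulate_retry_cycles(receive_retry_count, max_retry_cycles):
    """Follow one always-failing message until it is dead-lettered.

    Returns (delivery_attempts, trips_through_retry_queue).
    """
    attempts = 0
    retry_trips = 0
    while True:
        # In the main queue: one initial try plus the immediate retries
        # (assumed counting convention).
        attempts += 1 + receive_retry_count
        if retry_trips == max_retry_cycles:
            # Retry cycles exhausted: the message goes to the dead-letter queue.
            return attempts, retry_trips
        # Park in the retry queue; after the retry cycle delay the message
        # is moved back into the main queue.
        retry_trips += 1

no_retry_queue = simulate_retry_cycles(receive_retry_count=2, max_retry_cycles=0)
with_retry_queue = simulate_retry_cycles(receive_retry_count=2, max_retry_cycles=3)
```

    Under those assumptions, the first call gives up after 3 attempts with no trips through the retry queue, while the second makes 12 attempts spread over 3 trips. The extra attempts happen in short bursts separated by the delay, which is why other messages are not held up as much.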

    Next time: More Poison Message Handling

    Poison Message Handling

    I've got a few posts on queued and durable messaging coming up over the next few weeks, and we're going to need some vocabulary for those posts that hasn't been used yet while talking about web services. Today's article covers general background around the concept of "poison" messages.

    Web services without durability or reliability make no guarantee about preserving messages. When failure occurs during message processing, the web service may send back a fault describing that failure, but the original message that caused the fault is destroyed. You can layer some reliability on top of such a messaging system by making buffered copies of messages and using acknowledgments to indicate that processing is complete and the buffered copy can be destroyed.

    Buffering in memory doesn't really provide any durability because memory is a transient store. There's still no actual guarantee here that messages will be delivered.

    Now, suppose that the individual messages have a lot of value. The value could be an economic value, but the type of value isn't important for this description. We want to be rigorous now about making delivery guarantees to preserve that value. One way to implement the guarantee is to have a permanent, durable store and some atomic way of linking successful message processing together with deleting the message from the store. Let's call those pieces a queue and a transaction.

    There is a new problem with the durable service that the non-durable service did not face. In the error-handling case, we have unsuccessful message processing and therefore we do not delete the message from the store. The message will be picked out of the store again in the future to retry processing. If this was a transient processing error, then that behavior is exactly what we want. If this was a permanent processing error, perhaps because the message was malformed, we are going to be locked in a futile cycle of retrieving the message and unsuccessfully processing it. A lot of processing time is wasted making no progress.
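
    The difference between the transient and the permanent case can be sketched with a toy receive loop. Everything here is illustrative: the deque stands in for the durable store, peeking and popping stand in for the transaction, and the handler names are invented.

```python
from collections import deque

class MalformedMessage(Exception):
    pass

def receive_loop(queue, handler, max_iterations):
    """Process the head of the queue; delete a message only if the handler succeeds.

    Returns the number of wasted (failed) processing attempts.
    """
    wasted = 0
    for _ in range(max_iterations):
        if not queue:
            break
        message = queue[0]        # peek: the message stays in the store
        try:
            handler(message)
        except Exception:
            wasted += 1           # "rollback": nothing is removed, retry later
        else:
            queue.popleft()       # "commit": processing and deletion go together
    return wasted

# Transient error: the handler fails twice, then recovers.
calls = {"n": 0}
def flaky_handler(message):
    calls["n"] += 1
    if calls["n"] <= 2:
        raise TimeoutError("backend briefly unavailable")

transient_queue = deque(["order-1"])
transient_wasted = receive_loop(transient_queue, flaky_handler, max_iterations=10)

# Permanent error: the first message can never be processed.
def strict_handler(message):
    if message == "malformed":
        raise MalformedMessage(message)

poison_queue = deque(["malformed", "order-2"])
poison_wasted = receive_loop(poison_queue, strict_handler, max_iterations=10)
```

    In the transient run, two attempts are wasted and the queue then drains. In the permanent run, all ten iterations are wasted on the same message and "order-2" is never reached.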

    Poison messages are these permanently unprocessable messages. We need to take the poison message out of the queue and apply some strategy to it. A typical solution is to move the poison message to some other queue, where it will not be tying up the processing time of our main loop. Next time, we'll look at some of the options for poison message strategies used by MSMQ.

    Next time: MSMQ and Poison Messages

    Durable is More than Duplex

    Clemens and Shy pointed me at this article by Harry Pierson the other day. Since I was getting ready to present at a conference, I just now had time to read the article, and it is really good. If you have been confused in the past about the relationship between duplex contracts and durable services, then this article will solve that confusion. Harry nails the facts in addition to some analysis about why it took him a year to figure this out :)

    You Can't Fake Correlation

    How do I construct callbacks to work over a load balancer without affinity?

    Let's construct a scenario to demonstrate this question. I have three machines; call them X, Y, and Z. X and Y are together behind a network load balancer. This is a server to server communication scenario, where two servers are attempting to talk over a duplex contract.

    One of the load-balanced servers, X or Y, is going to first act as the client. Pretend that X is the relevant server in this case. X calls a service on Z that has a callback contract. At some point in the future, Z is going to respond on that callback to the load-balanced group. If X passed its real address to Z, then Z has no problem making the callback. If X gives the load-balancer address, then the callback from Z will sometimes reach X and sometimes reach Y. The load balancer is not affinitized to a particular machine. The interesting case is where we haven't pinned X as the instance to respond to.

    What can we do to make sure that the request by X is correlated with the response by Z, regardless of whether that response goes to X or Y? Well, one of two things needs to happen.

    • Z can stuff all of the necessary context information into the response message so that any server could process the response without having to know about the previous conversation. This is essentially turning a stateful problem into a stateless problem that sends a whole bunch more data. This has turned out to be a pretty interesting solution from the HTTP developer front.
    • X and Y can share a common, durable store of correlation information. This is typically a database, but we don't have to be specific about how X and Y share state between themselves.

    If you picked something in between going totally stateless and having durable state management, then there would be some interesting implications. There would be situations in which the receiving server would need to invent correlation information out of thin air in order to properly interpret the message. You can fake this some of the time, but sooner or later you'll get caught.
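
    The two workable options can be sketched side by side. This is a toy model, not WCF code: the Server class, its method names, and the dictionary standing in for the shared durable store are all assumptions made for illustration.

```python
class Server:
    """One machine behind the load balancer (X or Y in the scenario)."""

    def __init__(self, name, shared_store):
        self.name = name
        self.shared_store = shared_store   # stands in for a database both machines can reach

    def start_request(self, correlation_id, context):
        # Record the conversation state before calling out to Z.
        self.shared_store[correlation_id] = context
        return {"correlation_id": correlation_id}

    def handle_callback_with_store(self, callback):
        # Shared-store option: look the conversation up by its correlation id.
        context = self.shared_store[callback["correlation_id"]]
        return f"{self.name} finished {context['order']} with {callback['result']}"

    def handle_callback_stateless(self, callback):
        # Stateless option: everything needed travels inside the callback itself.
        context = callback["context"]
        return f"{self.name} finished {context['order']} with {callback['result']}"

store = {}
x = Server("X", store)
y = Server("Y", store)

# X starts the conversation, but the load balancer hands the callback to Y.
request = x.start_request("id-1", {"order": "order-42"})
callback = {"correlation_id": request["correlation_id"], "result": "ok"}
via_store = y.handle_callback_with_store(callback)

stateless_callback = {"context": {"order": "order-42"}, "result": "ok"}
via_message = y.handle_callback_stateless(stateless_callback)
```

    Both calls let Y finish the work that X started. If Y had its own private store instead of the shared one, the lookup would fail with a KeyError, which is the "out of thin air" situation described above.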

    Next time: Poison Message Handling

    Transport Encryption and Signing

    How do I control whether the transport signs and encrypts messages?

    This answer ties into the article I wrote a few weeks ago on describing channel security capabilities. If you don't remember about protection levels and security capabilities, then you should read that article first.

    The service and operation contract attributes include a field, called ProtectionLevel, for describing the minimal level of protection that should be applied to messages. If you have security in the channel stack and don't specify any settings, then the default is to both sign and encrypt messages. If the channel stack does not support the requested protection level, for instance HTTP supports neither encryption nor signing, then you'll get an exception saying that the binding you've chosen is incompatible with the specified security settings. If the channel stack does support that protection level, then you are guaranteed to receive at least the minimum level of protection on messages. What does that mean?

    Message security and transport channels are going to combine to provide at least the minimum level of protection. Let's make the picture simpler by saying that message security is not being used at all. We have a channel stack that just provides transport security. The transport security binding element has an additional configuration knob that lets you specify the target protection level, also labeled ProtectionLevel. Assume that we're being reasonable here and say that the transport protection level is at least the contract protection level. Then, the transport channel will attempt to provide a protection level that is no greater than the target protection level.

    Some transports do not have flexibility in the protection level that they provide. SSL security, such as with HTTPS, always provides encryption and signing. There's no way to throttle that security method back and so the transport protection level knob is ignored. Windows security, such as with TCP, does permit throttling the protection level. If the service contract specifies signing only, you're using TCP with Windows security, and you've set the transport protection level to signing only as well, then everything aligns for you to get signing only.
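
    The rules described so far can be summarized in a small decision function. Treat this as a simplified model of the prose, not as the actual WCF logic: the Level enum, the transport kind strings, and the choice to reject a transport target below the contract minimum are all assumptions of the sketch.

```python
from enum import IntEnum

class Level(IntEnum):
    NONE = 0
    SIGN = 1
    ENCRYPT_AND_SIGN = 2

def effective_level(contract_minimum, transport_kind,
                    transport_target=Level.ENCRYPT_AND_SIGN):
    """Return the protection level a transport-security-only stack would give."""
    if transport_kind == "unsecured":
        # A transport with no security (plain HTTP in the text) cannot sign or encrypt.
        if contract_minimum > Level.NONE:
            raise ValueError("binding cannot satisfy the contract's protection level")
        return Level.NONE
    if transport_kind == "ssl":
        # SSL always signs and encrypts, so the target knob is ignored.
        return Level.ENCRYPT_AND_SIGN
    if transport_kind == "windows":
        # Windows security can be throttled down to the target.
        if transport_target < contract_minimum:
            # Assumption: the "unreasonable" configuration is simply rejected here.
            raise ValueError("transport target is below the contract minimum")
        return transport_target
    raise ValueError(f"unknown transport kind: {transport_kind}")

tcp_sign_only = effective_level(Level.SIGN, "windows", Level.SIGN)
https_sign_only = effective_level(Level.SIGN, "ssl", Level.SIGN)
```

    In this model, the TCP case with everything set to signing only yields signing only, while the same request over SSL still yields both encryption and signing.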

    Note that if you want neither signing nor encryption, then the easiest way to do this is to simply replace your transport with one that does not supply security.

    Next time: You Can't Fake Correlation

    Actions for FaultExceptions

    What should I set the action parameter to when creating a FaultException?

    There is indeed a pair of overloads for creating fault exceptions that take an action parameter, although most of the overloads lack this.

    public FaultException(TDetail detail, FaultReason reason, FaultCode code, string action);
    public FaultException(TDetail detail, string reason, FaultCode code, string action);

    What does the action parameter actually do? Well, this may or may not be obvious, but setting the action on the fault exception controls the action that is used when sending the fault message. This is the reason why you can't just make up an action here and expect it to work. The receiver is looking for a particular action to reconstitute the fault message to an exception with the appropriate type. If you break the action here, then your typed FaultException turns into an untyped FaultException.

    The expected action value for a particular typed FaultException comes from the fault contract. If you just set up the fault contract and don't worry at all about the action when creating fault exceptions, then everything should work. The fault exception will automatically pick up the correct action from the fault contract. The answer then is that you shouldn't set the action parameter at all in most cases.

    The default fault contract action is generated by combining a number of type strings. For instance, if my service contract is IService, my operation is called Action, and I'm using a typed FaultException<string> instance, then the default fault action is http://tempuri.org/IService/ActionStringFault. Similarly, if I'm instead using a typed FaultException<IList<string>>, then the default fault action is http://tempuri.org/IService/ActionIListOf_StringFault. You can get as crazy as you want and figure out what the expected pattern should be for any type. Want to send an IDictionary<IList<string>, IDictionary<DateTime, string>>? It will be http://tempuri.org/IService/ActionIDictionaryOf_IListOf_String_IDictionaryOf_DateTime_StringFault. Of course, you can explicitly put an action in the fault contract attribute to set this value to anything.
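
    If you want to predict the string for a given type, the three examples above suggest a simple recursive pattern. The sketch below reproduces only that pattern as inferred from those examples; it is not the WCF implementation, and the function names and the tuple notation for generic types are made up for illustration.

```python
def type_fragment(clr_type):
    """Build the type part of the action.

    clr_type is either a CLR type name such as "String", or a tuple
    (generic_name, [type_arguments]) such as ("IList", ["String"]).
    """
    if isinstance(clr_type, str):
        return clr_type
    generic_name, type_args = clr_type
    # Pattern inferred from the examples: Name + "Of" + "_Arg" for each argument.
    return generic_name + "Of" + "".join("_" + type_fragment(arg) for arg in type_args)

def default_fault_action(contract, operation, detail_type,
                         namespace="http://tempuri.org/"):
    """Combine namespace, contract, operation, and detail type into an action."""
    return f"{namespace}{contract}/{operation}{type_fragment(detail_type)}Fault"

simple = default_fault_action("IService", "Action", "String")
listed = default_fault_action("IService", "Action", ("IList", ["String"]))
nested = default_fault_action(
    "IService",
    "Action",
    ("IDictionary", [("IList", ["String"]), ("IDictionary", ["DateTime", "String"])]),
)
```

    The three results match the three actions quoted above. Types that fall outside those examples, such as arrays or nullable types, may well follow different rules.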

    Next time: Transport Encryption and Signing

    Bindings for Workgroups

    What's the fastest binding for securely communicating over an intranet? How about if the client and server don't share a domain?

    A lot of attention gets paid to Internet configurations, where HTTP rules the world. HTTP is so dominant in that environment because it is a very open and standardized protocol. Servers that support HTTP as their primary network transport protocol have a lot of reach. It's easy to write clients that connect to these servers, which means a lot more clients will get written than would be the case if the server used some obscure network transport protocol.

    The world is completely different on an intranet because suddenly reach is no longer a critical factor for adoption. It is possible to use both political and technical means to control the technology that the client and server commonly share. This sharing is helpful in a lot of ways because it allows the use of specialized network transport protocols that are faster and more efficient than the standardized protocols. By removing the requirement of reach, it is possible to do better at meeting other requirements, such as performance.

    In the general case, the fastest transport in WCF for communicating between machines is the TCP transport. The fastest encoding in WCF is the binary message encoder. Since we control the technology in this scenario, we can enforce support for these protocols. That combination is the default setting for the NetTcpBinding. However, NetTcp has other features turned on by default that take back some of these performance advantages. For example, leaving security enabled is going to roughly cut the network transfer performance of TCP in half. Security is an example of a desirable feature with a significant cost, but WCF lets you avoid paying that cost if you don't need the feature. That's the essence of the "pay as you go" model.

    We need security in this case, but we can go with the lightest strategy for securing the connection. Without a trust relationship between the client and server, we can't rely on a third-party service to broker trust between us. The simple and direct approach in this case is to use NetTcp with transport security and rely on the plain old Windows NTLM authentication. NTLM is pretty cheap and allows us to use the basic username and password model for transferring data to a remote machine.

    Next time: Actions for FaultExceptions

    Reducing Memory Usage with Large Messages

    I'm working on an application that processes many large messages at the same time. The messages should all fit into memory, but I'm running out of memory much sooner than expected. How do I reduce the overhead associated with each message?

    I'm only going to talk about the case where the system is legitimately running out of memory sooner than expected.

    If you're trying to buffer messages that are larger than the memory you have available, then there is no clever trick to make those messages fit into memory. You will have to store all or part of the messages somewhere else, either by spooling to disk or by using streamed instead of buffered transfers.

    If you're exceeding not the available memory but rather the available address space, then this similarly requires more drastic approaches. A 32-bit machine can give an application roughly two billion bytes of address space. Finding more than a few hundred million of those bytes that are both free and contiguous is a challenge. As you repeatedly allocate and free memory, fragmenting the free areas of the address space, the challenge becomes larger and larger. That is a different problem than the one this article is trying to solve.

    The specific case that is being talked about here is when the number of allocated buffer bytes consistently exceeds the number of actual message bytes by a significant margin. You may see this when, for example, you're processing a bunch of messages that occupy 12 megabytes of memory but sit inside of 16 megabyte buffers. This is the result of using a buffering system that is more appropriate for small message sizes.

    When processing lots of small messages, the cost of allocating and freeing buffers can be noticeable relative to the amount of work that you're doing. Transports have two knobs, MaxBufferSize and MaxBufferPoolSize, to control this allocation strategy. When you're dealing with lots of large messages, then this pooling strategy doesn't save you that much time and can cost you some memory proportional to the message size. By setting MaxBufferPoolSize to 0, you eliminate this allocation overhead and raise the number of messages that you can process before running out of memory. Remember though, unless your messages are large, disabling buffer pooling tends to hurt more than it helps.
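
    A quick back-of-the-envelope model shows how the overhead adds up. The sketch assumes that pooled buffers are handed out in power-of-two sizes, which is chosen only because it reproduces the 12 megabyte message in a 16 megabyte buffer from the example above; the real buffer manager may size its buffers differently. The function names are hypothetical.

```python
MB = 1024 * 1024

def pooled_buffer_size(message_bytes):
    """Smallest power-of-two buffer that fits the message (assumed pool policy)."""
    size = 1
    while size < message_bytes:
        size *= 2
    return size

def footprint(message_sizes, pooled):
    """Total bytes held for the given messages, with or without pooling."""
    if pooled:
        return sum(pooled_buffer_size(n) for n in message_sizes)
    # Without pooling, each buffer is allocated at exactly the message size.
    return sum(message_sizes)

sizes = [12 * MB] * 10
with_pool = footprint(sizes, pooled=True)
without_pool = footprint(sizes, pooled=False)
overhead_mb = (with_pool - without_pool) // MB
```

    For ten 12 megabyte messages, this model holds 160 megabytes with pooling and 120 megabytes without, so a third of the message size again is spent on slack.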

    What are large messages? Certainly, if all of your messages are a few tens of megabytes in size, then you have large messages. If you regularly have messages that are a few tens of kilobytes in size, then you have small messages. If your messages are in-between or vary wildly in size, then you should experiment to see what is best for you.

    Next time: Bindings for Workgroups
