Another preview release (preview 3) for the Managed Extensibility Framework went live yesterday. MEF is a component system for building applications that use add-ins in a standard way. The most common demonstration of this so far has been for developing Visual Studio add-ins, but you can imagine many other applications that need to do similar kinds of things. It looks like this release has changed the container and export models a bit as well as added a persistent assembly cache.
The preview release includes the source for MEF (System.ComponentModel.Composition).
Since I'm on the topic of highly distributed, concurrent, and asynchronous programming I thought I'd mention some of the other research that is going on in this area. One of those research projects is the method of monadic programming. Monadic programming is an abstraction for function composition that helps reduce the mental juggling required to track functional side effects to shared state.
There are a few new videos on Channel 9 from Brian Beckman talking about monadic programming as well as a few old ones that give more background.
You'll probably want to have read the previous articles about correlation for this to make sense.
Now that you've seen some of the details about message queries, you can combine queries with Ed's demonstration of programming a correlation in workflow to infer how the system actually works.
In a workflow program, correlations are described by placing send and receive messaging activities that are associated by a shared correlation handle. Each of the messaging activities has attached correlation queries that help define the application protocol. The correlation queries are just examples of queries to be evaluated by the message query engine. Ed wrote all of the correlation queries using XPath expressions, which can be evaluated using an XPath implementation of the MessageQuery and MessageQueryTable classes.
As I mentioned, a hash is taken of the various message query results to create a correlation key. It's this key computation result that gets stored in the correlation handle variable. A correlation handle variable is typically used in a write-once fashion. The first person to touch a message (either sending or receiving) initializes the value slot of the correlation handle variable with the key computed from the message. Everyone else associates successive messages to the correlation handle by checking whether the key computed for the new message matches the value stored in the variable. There are also activities to explicitly initialize and finalize correlations so that state can be manipulated without having to exchange messages. For example, these explicit activities replace the need to construct dummy sends to initialize a correlation like in BizTalk.
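The write-once behavior described above can be sketched in a few lines. This is only an illustration under stated assumptions: CorrelationHandleSketch, Initialize, and Associate are hypothetical names, not the workflow runtime's correlation handle type, and the key is assumed to already be computed as a string.

```csharp
using System;

// Hypothetical write-once handle. It is NOT the workflow runtime's correlation
// handle; it only illustrates the "first message initializes, later messages
// must match" behavior.
public sealed class CorrelationHandleSketch
{
    private readonly object gate = new object();
    private string key; // null until the correlation is initialized

    public bool IsInitialized
    {
        get { lock (gate) { return key != null; } }
    }

    // Explicit initialization, so that no dummy message has to be exchanged.
    public void Initialize(string correlationKey)
    {
        if (correlationKey == null) throw new ArgumentNullException("correlationKey");
        lock (gate)
        {
            if (key != null) throw new InvalidOperationException("The correlation is already initialized.");
            key = correlationKey;
        }
    }

    // The first message to touch the handle stores its key. Every later message
    // is associated only if its key equals the stored one.
    public bool Associate(string messageKey)
    {
        if (messageKey == null) throw new ArgumentNullException("messageKey");
        lock (gate)
        {
            if (key == null)
            {
                key = messageKey;
                return true;
            }
            return string.Equals(key, messageKey, StringComparison.Ordinal);
        }
    }
}
```

With this sketch, the first Associate call always succeeds and fixes the key, and a later call with a different key returns false instead of overwriting the stored value.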
Reality is a little more complicated than this for a number of reasons.
Many interesting pieces of information are not accessible from walking the XML representation of the message. A language such as XPath might have no native way of describing the location of this kind of information. In some cases, such as message headers, we're able to devise standardized ways of accessing the information. In other cases, the situation is too varied to standardize. Instead, a property promotion mechanism is needed to bring the information into a standard format. The correlation system has a property promotion mechanism that works through a message property attached to the message. Information that is added to the message property is accessible through an XPath extension function that is provided by the MessageQuery implementation.
Another complication is that the workflow runtime might not have enough information to run the queries. When sending a message, there may be promoted properties that aren't computed until the message is being sent. To handle these cases there must be some way to integrate the providers and consumers of correlation information. The property promotion mechanism additionally has a way to provide callbacks during the resolution of a correlation so that the correlation result can be computed and stored.
Although we've covered only a few of the classes related to correlation, this series has already gotten fairly long. I'll come back to correlation in the future after looking at more features for asynchronous and decoupled programming in WCF 4.0.
Since a message filter and message query share a similar heritage, let's start by looking at the conceptually simpler message filter APIs. You probably haven't seen message filters before unless you've gone out of your way to explore everything that comes with WCF. They don't appear in the ordinary use of web services.
A message filter is basically a matching delegate that works on WCF messages. There are not many interesting message filter methods.
public abstract class MessageFilter
{
    protected internal virtual IMessageFilterTable<FilterData> CreateFilterTable<FilterData>();
    public abstract bool Match(Message message);
    public abstract bool Match(MessageBuffer buffer);
}
As you might have guessed, there are Match methods for both messages and message buffers. There's also this message filter table that you might not have expected. The message filter table is used to optimize execution of message filters. A message filter table contains a collection of related message filters and also has a slot for the application to store state data with each filter.
public class MessageFilterTable<TFilterData> : IMessageFilterTable<TFilterData>,
    IDictionary<MessageFilter, TFilterData>,
    ICollection<KeyValuePair<MessageFilter, TFilterData>>,
    IEnumerable<KeyValuePair<MessageFilter, TFilterData>>, IEnumerable
{
    public bool GetMatchingFilter(Message message, out MessageFilter filter);
    public bool GetMatchingFilter(MessageBuffer buffer, out MessageFilter filter);
    public bool GetMatchingFilters(Message message, ICollection<MessageFilter> results);
    public bool GetMatchingFilters(MessageBuffer buffer, ICollection<MessageFilter> results);
    public int GetPriority(MessageFilter filter);
}
I've picked a subset of the methods that help show how you can use the message filter table abstraction to change how the message filters execute. For example, let's say that you've got a bunch of message filters in a message filter table and two of the filters share some common work. When you execute the batch of message filters through the message filter table interface, the shared common work would ideally get executed only once. Hiding the message filters behind the message filter table abstraction allows for these types of optimizations because the black box prevents an external observer from seeing the actual computations that get done. The priority scheme is just an add-on to deal with the fact that multiple message filters may match the same message.
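To make the shared-work idea concrete, here is a minimal sketch of a table that batches filters. It is an assumption-laden stand-in rather than the WCF implementation: SketchFilterTable and its members are hypothetical, filters are plain XPath strings evaluated with System.Xml.XPath, and the message is a raw XML string instead of a Message or MessageBuffer. The shared work in this sketch is parsing the message and evaluating a duplicated expression, each of which happens once per message.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml.XPath;

// Hypothetical table, NOT the WCF MessageFilterTable<TFilterData>. It only
// illustrates batching filters so that shared work happens once per message.
public sealed class SketchFilterTable<TData>
{
    // Each distinct XPath expression maps to the state stored with its filters.
    private readonly Dictionary<string, List<TData>> entries =
        new Dictionary<string, List<TData>>();

    // Counts XPath evaluations so that the saving is observable.
    public int Evaluations { get; private set; }

    public void Add(string xpath, TData data)
    {
        List<TData> bucket;
        if (!entries.TryGetValue(xpath, out bucket))
        {
            bucket = new List<TData>();
            entries.Add(xpath, bucket);
        }
        bucket.Add(data);
    }

    // Parses the message once and evaluates each distinct expression once,
    // no matter how many filters were registered with that expression.
    public List<TData> GetMatchingValues(string messageXml)
    {
        XPathNavigator navigator =
            new XPathDocument(new StringReader(messageXml)).CreateNavigator();
        var results = new List<TData>();
        foreach (KeyValuePair<string, List<TData>> entry in entries)
        {
            Evaluations++;
            if (navigator.SelectSingleNode(entry.Key) != null)
            {
                results.AddRange(entry.Value);
            }
        }
        return results;
    }
}
```

A caller only sees which state values matched. Whether the table evaluated three filters separately or collapsed two of them into one evaluation is invisible from the outside, which is the property that makes this kind of optimization possible.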
A message query looks almost the same as a message filter except that message queries generate results instead of matches.
public abstract class MessageQuery
{
    public virtual MessageQueryCollection CreateMessageQueryCollection();
    public abstract TResult Evaluate<TResult>(Message message);
    public abstract TResult Evaluate<TResult>(MessageBuffer buffer);
}
The message query table similarly replaces matches with results. I've again picked a subset of methods that demonstrate this. You'll also notice that the message query APIs emphasize a multiple match mode throughout rather than the single match mode of the message filter table. This is due to the slightly different use cases that the two were designed for.
public class MessageQueryTable<TItem> : IDictionary<MessageQuery, TItem>
{
    public IEnumerable<KeyValuePair<MessageQuery, TResult>> Evaluate<TResult>(Message message);
    public IEnumerable<KeyValuePair<MessageQuery, TResult>> Evaluate<TResult>(MessageBuffer buffer);
}
Last time I talked about how WCF 4.0 standardizes many different types of correlations using a query mechanism and promised to go into more detail today.
You might already be familiar with the message filter engine in WCF 3.0. If you haven't seen message filters before, then the message filter engine is just a way to check for matches in a message. For example, you might have an implementation of a message filter that uses XPath expressions and then create the filter /s:Envelope/s:Body/x:FooRequestMessage/y:OrderId to match SOAP messages with a particular structure. Instead of using an XPath to match the message content, you might instead have used an intrinsic function to match a message header or maybe even have described the filter in an entirely different language. Evaluating the message filter tells you whether the particular message satisfies the rules of the filter. The message filter engine manages a table of message filters so that the evaluation of many filters can be optimized.
For correlation, we created an equivalent to a message filter called a message query. You can think of a message query as a message filter that both checks for matches in a message as well as returns the value that was matched. For example, if you used the XPath message filter above, then it would tell you whether the message matched or not. If you used the equivalent XPath message query, then the result of the match would also give you the value of the y:OrderId node. Just like message filters, message queries are organized into tables by a message query engine to optimize the execution of many message queries at once.
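The difference between the two can be sketched with ordinary XPath evaluation. The code below is a stand-in under stated assumptions, not the WCF filter or query classes: FilterVersusQuery, MatchFilter, and EvaluateQuery are hypothetical names, the message is a raw XML string, and the x and y namespace URIs are made up for the example.

```csharp
using System.IO;
using System.Xml;
using System.Xml.XPath;

// Stand-in for the filter/query distinction using plain System.Xml.XPath over
// an XML string. It does not use the WCF Message, MessageFilter, or
// MessageQuery types.
public static class FilterVersusQuery
{
    // The x and y namespace URIs are assumptions made up for this example.
    private static XmlNamespaceManager CreateNamespaces()
    {
        var namespaces = new XmlNamespaceManager(new NameTable());
        namespaces.AddNamespace("s", "http://www.w3.org/2003/05/soap-envelope");
        namespaces.AddNamespace("x", "urn:example:messages");
        namespaces.AddNamespace("y", "urn:example:orders");
        return namespaces;
    }

    private static XPathNavigator Select(string messageXml, string xpath)
    {
        var document = new XPathDocument(new StringReader(messageXml));
        return document.CreateNavigator().SelectSingleNode(xpath, CreateNamespaces());
    }

    // Filter semantics: report only whether the message matches.
    public static bool MatchFilter(string messageXml, string xpath)
    {
        return Select(messageXml, xpath) != null;
    }

    // Query semantics: return the matched value, or null when nothing matches.
    public static string EvaluateQuery(string messageXml, string xpath)
    {
        XPathNavigator node = Select(messageXml, xpath);
        return node == null ? null : node.Value;
    }
}
```

For a message whose body contains a y:OrderId element holding 5, MatchFilter with the expression /s:Envelope/s:Body/x:FooRequestMessage/y:OrderId answers true, while EvaluateQuery with the same expression hands back the string "5".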
You might see how to build correlations out of this query mechanism. The correlation query describes the structure or piece of information that you'll be correlating on. This information may be part of a message, a message header, or maybe even some function that gets run that doesn't depend on the message at all. Some of the queries will return matches and produce query results. These query results are then the value of the correlation that you'll be matching against. For example, your correlation query may pick out the y:OrderId node from a message if it exists. Let's say that it does and the node has the value 5. Then, the message query result is the value 5, and you can match that value against a table of previously observed values to correlate this message with some previously created state. Since you might have multiple pieces of information that are extracted, you want to have some way to hash and compare all of the matches at once. That hash is the correlation key that I talked about last time.
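As a rough sketch of that last step, the code below combines several query results into one key and uses the key to look up previously created state. Everything here is an assumption for illustration: CorrelationSketch, CorrelationTable, and their members are hypothetical, and the choice of SHA-256 over a length-prefixed encoding is arbitrary, since the hash that WCF actually uses is not described here.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;
using System.Text;

public static class CorrelationSketch
{
    // Combines named query results into one key. Sorting by name makes the key
    // independent of evaluation order, and the length prefixes keep the pairs
    // ("ab", "c") and ("a", "bc") from producing the same input. SHA-256 is an
    // arbitrary choice for this sketch.
    public static string ComputeKey(IDictionary<string, string> queryResults)
    {
        var names = new List<string>(queryResults.Keys);
        names.Sort(StringComparer.Ordinal);
        var builder = new StringBuilder();
        foreach (string name in names)
        {
            string value = queryResults[name];
            builder.Append(name.Length).Append(':').Append(name);
            builder.Append(value.Length).Append(':').Append(value);
        }
        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(builder.ToString()));
            return BitConverter.ToString(hash);
        }
    }
}

// Table of previously observed keys and the state that each one correlates to.
public sealed class CorrelationTable<TState>
{
    private readonly Dictionary<string, TState> states = new Dictionary<string, TState>();

    public void Remember(IDictionary<string, string> queryResults, TState state)
    {
        states[CorrelationSketch.ComputeKey(queryResults)] = state;
    }

    public bool TryCorrelate(IDictionary<string, string> queryResults, out TState state)
    {
        return states.TryGetValue(CorrelationSketch.ComputeKey(queryResults), out state);
    }
}
```

In the OrderId example, remembering state under the query result OrderId = 5 means that a later message whose query also yields 5 finds that state, while a message yielding 6 finds nothing.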
Next time we'll look at the APIs for message filters and message queries to see exactly how this process works.
One of the topics that you'll hear a lot about for asynchronous and decoupled programming in WCF 4.0 is correlation. Correlation is a relationship between one message and another message or one message and a piece of state. With synchronous programming, you may not always think about the correlations that are present. Correlations may be implicit through a call stack or through local variables. Or, they may be dramatically simplified by virtue of the fact that you know what's going to happen next and can craft your code to match the situation.
In asynchronous programming, correlation is the glue that joins together different operations. There are many different types of correlations. You've probably used several before.
For example, if you've defined a service contract with a CallbackContract, then you've specified a correlation based on two parties sharing a continuous network connection for an exchange of messages. If you've defined an HTTP cookie, then you've specified a correlation based on protocol information that is durably stored in between messaging operations. If you've included an ID field in a message, then you've specified a correlation based on message content. In many systems, each different type of correlation has a different way of describing and programming the correlation mechanism.
In WCF 4.0, one of the things we've done is think about how to standardize many different types of correlation behind a similar set of mechanisms. You'll see some examples using this correlation mechanism for workflow in the second half of Ed Pinto's PDC talk.
The basic pattern has just three operations.
It turns out that many complicated patterns for correlation can be thought of in terms of queries. Next time I'll show you a few examples so that you can see what I mean by this.
For a long time it's been true that the average personal computer is not well-suited to running highly available public facing services. In the original model for network mail delivery though, everyone that wanted to receive mail needed a local mail transfer service. The increasing use of individual workstations rather than shared servers made this problem particularly acute. Two protocol families emerged to deal with the problem and allow access to a remote mail store: the Post Office Protocol and the Interactive Mail Access Protocol.
The dominant mail protocol for quite a while ended up being the third version of the Post Office Protocol, also called POP3. The first release of the POP3 protocol, based on the earlier POP and POP2 protocols, was twenty years ago. POP3 interacts with a mail user agent running on a standalone workstation to manage mail stored by a mail server. You can read more about the details of the POP3 protocol in RFC 1081.
To give a comparison with mail systems at the time, the IMAP specification published a few months earlier described an implementation with limits of 6.75 MB for a mailbox and 18,432 total messages. Since these limits were much larger than the capabilities of existing mail programs, it was expected that no one had yet encountered them.
Last week a preview training kit was posted for Visual Studio 2010 and .NET Framework 4.0. A training kit is a collection of presentations, labs, and demos that broadly demonstrates the features of a product. This training kit focuses on these upcoming releases but is rather light on WCF content. You'll have to wait until I continue my series on the future of WCF on Wednesday for that. As with past training kit previews, there will be updates as more content is available.
Having read part 1 will be helpful.
As I mentioned last time, there were two markets in particular that I thought were interesting for web service developers to expand into when using WCF.
The first market was REST and the HTTP application style of web development. Since the initial version of WCF I think that we've made a bit of progress in that direction although we still have quite a ways to go.
Before WCF shipped we managed to slip in a feature late in the development cycle to wash some of the SOAP off of messages: the None MessageVersion. This feature replaced a clumsier system that tried to map HTTP headers around and has since turned out to be more generally useful in non-HTTP systems. I've seen a lot of opportunities since then to make use of MessageVersion.None in a variety of contexts.
During Orcas we also made a large investment relative to the size of the release in making WCF friendlier to the web. Examples include special service hosts to reduce configuration, new bindings and message encoders, attributes for working with HTTP methods, URI templates, and so on. I think that this was a great set of improvements in WCF for web application development but we still have no direct accommodations for the REST architectural style. I hope that the REST starter kit proves to be an initial down payment in that area. It will take some time and work before those features are ready to be incorporated into the main product. Until then, we should be able to experiment and iterate quite quickly because of the lightweight release format.
The other market that I took interest in was enterprise and application integration. Although many people viewed WCF as being enterprise-focused because of the prominence of the WS-* protocols, those types of protocols represent only one kind of enterprise scenario.
Many of the WS-* protocols make an implicit assumption of trust between the two parties to smoothly and cheaply coordinate work. That is not to say that they take no precautions in protocol or coordination to guard access and resources. However, the design of the protocols necessitates a minimum level of coupling to avoid problematic resource and timeliness issues.
For example, a distributed transaction between two parties requires resource allocations for the transaction state, an optimistic assumption by the transaction initiator that things will be resolved quickly, and a pessimistic assumption by the transaction receiver that the work must be performed soon rather than at the receiver's convenience. When operating in concert with a system that you do not control, some of these assumptions and arrangements may be ill-advised. As a result of these types of issues, many operations that are coordinated between two businesses, or within a sufficiently large business, are conducted in an asynchronous and decoupled fashion.
You'll be seeing a lot of features coming to improve the state of asynchronous and decoupled programming in WCF in future releases. Starting next time I'll talk about some of the ones we've announced in .NET 4.0.
It's been two years since we shipped the first release of WCF (codenamed Indigo). It was actually even a little before that that I started thinking about what features we should include in the upcoming .NET 4.0 release.
The first time that I wrote down a list of features that I was targeting for improvement was the summer before, approximately two and a half years ago. There is always a period of time towards the end of a release where every addition is scrutinized too closely to add a new piece of work but the old pieces of work no longer require the continuous attention of everyone on the team. This is typically one of those creative periods where people are doing things such as thinking about what seems at the time like the distant future. I'll have to go check how closely that initial list compares to what we are planning to do today. I suspect that the two will read very differently.
Due to the way that the Orcas release was conceived and assembled, many of the ideas we had could only be done in the next major release. This led to us essentially planning for both releases at once. Orcas became a targeted release because there were significant restrictions on both our time until delivery and what we could do to the existing code from an engineering standpoint.
One of the things that I took notice of early on is that there were some particular biases in the types of web services that WCF was built to support. I thought that there were other interesting markets for web service developers that WCF wasn't doing a good job of catering to at the time.
WCF started with a clear bias towards SOAP, WS-* protocols, and other heavyweight standards that were being popularized at the time but had not been ubiquitously adopted. One effort underway was to continue evangelizing and spreading these standards.
I didn't believe though that you could force the world to use a single approach for building web services. You also couldn't convince everyone to completely discard and replace all of the entrenched connected systems and applications that they had been developing and investing in for the past 25 years. Realistically, you can't convince anyone, except for a rounding error's worth of people, to do that without some tangible value in return. Instead, you have to look for markets where this weakness doesn't apply or where it can be turned into a strength.
There were two markets in particular that I thought were interesting for web service developers to expand into when using WCF, which I'll talk about next time.
Since I did the survey question on extensibility, I thought I'd do this followup on configuration. The two are often talked about together but have very different needs.
Question:
What's the one thing you would change about how configuration is done for WCF applications?
Pick anything you want about the configuration format, tooling, integration with service code, or any other topic you'd like, but try to keep in mind that this is for an application that uses WCF. It's ok if the details of the change have to do with IIS or other products though.
WCF has a whole lot of extensibility points. Many of those extensibility points use similar systems for describing and installing extensions, but overall you still end up with multiple ways of doing extensibility depending on what you're extending.
The Managed Extensibility Framework is a standardized plugin model for dynamically composing applications. If you haven't heard of MEF before, there's an introduction on the site, a short demo together with Visual Studio, and a longer explanation. These types of plugin frameworks seem to be increasingly popular as a way of extending the application development experience.
How would you expect a more uniform model for extensibility to make developing or using WCF applications easier, cheaper, or better?
(There are absolutely no plans at this time for using MEF with WCF or changing the extensibility model. I'm just curious what your impressions are.)
As a way to wrap up on PDC content I thought I'd do a bit of indexing to highlight the different areas of activity over the next few years for general purpose web services development.
Service Frameworks
WCF 4.0: Building WCF Services with WF in Microsoft .NET 4.0 (Ed Pinto)
WF 4.0: A First Look (Kenny Wolf)
WF 4.0: Extending with Custom Activities (Matt Winkler)
WCF: Developing RESTful Services (Steve Maine)
WCF: Zen of Performance and Scale (Nicholas Allen)
ASP.NET MVC: A New Framework for Building Web Applications (Phil Haack)
Windows 7: Web Services in Native Code (Nikola Dudar)
Service Hosting
IIS 7.0 and Beyond: The Microsoft Web Platform Roadmap (Vijay Sen)
"Dublin": Hosting and Managing Workflows and Services in Windows Application Server (Dan Eshner)
Service Modeling
"Oslo": Customizing and Extending the Visual Design Experience (Don Box, Florian Voss)
A Lap around "Oslo" (Douglas Purdy, Vijaye Raji)
"Oslo": The Language (David Langworthy, Don Box)
"Oslo": Repository and Models (Chris Sells, Martin Gudgin)
"Oslo": Building Textual DSLs (Chris Anderson, Giovanni Della-Libera)
Service Security
Identity Roadmap for Software + Services (Vittorio Bertocci, Kim Cameron)
Identity: Live Identity Services Drilldown (Jorgen Thelin)
Identity: Connecting Active Directory to Microsoft Services (Lynn Ayres, Tore Sundelin)
Identity: "Geneva" Server and Framework Overview (Caleb Baker, Stuart Kwan)
Identity: "Geneva" Deep Dive (Jan Alexander)
Identity: Windows CardSpace "Geneva" Under the Hood (Rich Randall)
Services in the Cloud
A Lap Around the Azure Services Platform (John Shewchuk)
Architecture of the .NET Services (Dennis Pilarinos, John Shewchuk)
Live Services: A Lap around the Live Framework and Mesh Services (Ori Amiga)
Live Services: Building Applications with the Live Framework (Raymond Endres)
Live Services: Mesh Services Architecture and Concepts (Abolade Gbadegesin)
.NET Services: Messaging Services - Protocols, Protection, and How We Scale (Clemens Vasters)
Live Services: Live Framework Programming Model Architecture and Insights (Ori Amiga)
.NET Services: Orchestrating Services and Business Processes Using Cloud-Based Workflow (Moustafa Ahmed)
Live Services: Building Mesh-Enabled Web Applications Using the Live Framework (Arash Ghanaie-Sichanie)
Live Services: FeedSync and Mesh Synchronization Services (Steven Lees)
Live Services: Notifications, Awareness, and Communications (John Macintyre)
Live Services: The Future of the Device Mesh (Jeremy Mazner)
.NET Services: Connectivity, Messaging, Events, and Discovery with the Service Bus (Clemens Vasters)
.NET Services: Logging, Diagnosing, and Troubleshooting Applications Running Live in the Cloud (Mark Gilbert, Steve Garrity)
Sync Framework: Enterprise Data in the Cloud and on Devices (Liam Cavanagh)
Live Services: What I Learned Building My First Mesh Application (Don Gillett)
Live Services: Programming Live Services Using Non-Microsoft Technologies (Nishant Gupta)
Designing Your Application to Scale (Max Feingold)
Developing and Deploying Your First Windows Azure Service (Steve Marx)
Windows Azure: Architecting & Managing Cloud Services (Yousef Khalidi)
Windows Azure: Cloud Service Development Best Practices (Sriram Krishnan)
Windows Azure: Essential Cloud Storage Services (Brad Calder)
Windows Azure: Modeling Data for Efficient Access at Scale (Niranjan Nilakantan, Pablo Castro)
A Lap Around Windows Azure (Manuvir Das)
Windows Azure: Programming in the Cloud (Daniel Wang, Stefan Schackow)
Under the Hood: Inside the Windows Azure Hosting Environment (Chuck Lenzmeier, Frederick Smith)
.NET Services: Access Control Service Drilldown (Justin Smith)
.NET Services: Access Control In Microsoft .NET Services (Justin Smith)
Data Driven Services
SQL Services: Under the Hood (Gopal Kakivaya, Tony Petrossian)
SQL Server 2008: Developing Large Scale Web Applications and Services (Hala Al-Adwan, Jose Blakeley)
SQL Services: Futures (Patric McElroy)
A Lap around SQL Services (Soumitra Sengupta)
SQL Services: Tips and Tricks for High-Throughput Data-Driven Applications (David Robinson)
Developing Applications Using Data Services (Mike Flasko)
Offline-Enabled Data Services and Desktop Applications (Pablo Castro)
Now that we're winding down on 2008 conferences, I've started seeing more news coming about the events scheduled for next year.
MIX 2009 is being held March 18th to 20th in Las Vegas. Registration for the event is now open.
TechEd 2009 is being held May 11th to 15th in Los Angeles. After experimenting with holding separate conference tracks for developers and IT professionals, they're going back to the single conference arrangement. Registration for the event will be starting in December.
As a bit of a surprise, there are already plans announced for another PDC in 2009. PDC 2009 is being held November 17th to 20th in Los Angeles. Since that's more than a year away, there's absolutely no information about registration yet but you can sign up for the mailing list.
Now that I'm mostly caught up with reporting on sessions from PDC, I'll start talking a bit about the future of WCF that we announced during the conference.
The REST Starter Kit is actually something that is available now rather than coming in the next release of the framework. We had been thinking for a while about the state of REST support for the platform and decided to bring together guidance and samples for how to use this architectural style with WCF. The starter kit includes some experimental framework classes, Visual Studio templates, code samples, and documentation to better enable you to create applications in the REST style using WCF. Some of these resources are focused on REST itself, others are more broadly applicable to HTTP services, and still others deal with specific protocols that are often found together with REST applications, such as Atom.
There's no guarantee that any part of the starter kit will become part of the mainstream framework, but if you find something here that's useful, tell me about it so that we know which features you get the most value from. You can also tell me about things that we didn't include in the starter kit that you would have expected to see.
Now that most of the PDC session videos are up, here are links for the ones that I've talked about so far. I'm not going to update the previous posts but I'll put the links directly in future posts. It looks like the way Silverlight does streaming eats a browser connection for each video so open one or two at a time or the player won't start.
Under the Hood: Advances in the .NET Type System
The Future of C#
Deep Dive: Dynamic Languages in Microsoft .NET
WCF: Zen of Performance and Scale
Project "Velocity": Under the Hood
Live Services: Mesh Services Architecture and Concepts (video broken, slides are available)
(Presenters: Anil Nori and Murali Krishnaprasad)
What they said:
Learn about the architecture of Velocity, Microsoft's main memory distributed caching framework. Hear how Velocity was built to meet the performance, scale, latency, and availability requirements of large scale enterprise and web applications. Learn about Velocity components and discuss design tradeoffs and mechanisms for in-memory storage, data placement, and data replication for performance, scale, and availability. Also, hear how Velocity provides database capabilities like LINQ support, indexing, concurrency control, and data consistency.
What I said:
Velocity is a caching engine for low latency, high scale, and high availability of data between many readers and writers. The cache can either be organized using replication of the data among all the cache servers or by partitioning so that different parts of the data are available at different cache servers. The full cache is augmented with a local cache to improve performance for frequently retrieved data. Entries are stored in the local cache using a deserialized format and the local cache uses update notifications to synchronize with the full cache.
A partitioned cache is divided into primary and secondary storage of items. One of the cache machines is elected to hold a global partition map that other cache machines share. Routing tables use the partition map to direct lookup requests for a range of hashed keys. Data can be retrieved from its primary storage or by examining a quorum of secondary storage locations. If the primary storage fails, the partition manager designates a new primary storage location from among the secondary storage locations. Data have a sequence identifier to track updates and to implement update subscriptions and notifications. Subscribers poll cache machines for interesting event notifications. When a cache machine is promoted to be the primary storage location, the latest sequence number is checked so that another machine may be promoted instead if the originally selected location is out of date.
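The routing and failover steps in these notes can be sketched as follows. This is a guess at the shape of the mechanism, not Velocity's implementation: ReplicaSketch, PartitionSketch, and PartitionMapSketch are hypothetical names, the FNV-1a style hash is an arbitrary stable hash, and promoting the secondary with the highest sequence number is one simple reading of the out-of-date check.

```csharp
using System;
using System.Collections.Generic;

// A storage location for a partition: a cache machine plus the latest
// sequence number that it has applied.
public sealed class ReplicaSketch
{
    public ReplicaSketch(string machine, long sequence)
    {
        Machine = machine;
        Sequence = sequence;
    }

    public string Machine { get; private set; }
    public long Sequence { get; private set; }
}

public sealed class PartitionSketch
{
    public PartitionSketch(ReplicaSketch primary, IEnumerable<ReplicaSketch> secondaries)
    {
        Primary = primary;
        Secondaries = new List<ReplicaSketch>(secondaries);
    }

    public ReplicaSketch Primary { get; private set; }
    public List<ReplicaSketch> Secondaries { get; private set; }

    // After the primary fails, promote the secondary with the highest sequence
    // number so that an out-of-date location is passed over.
    public void PromoteAfterPrimaryFailure()
    {
        if (Secondaries.Count == 0)
        {
            throw new InvalidOperationException("No secondary is available to promote.");
        }
        ReplicaSketch best = Secondaries[0];
        foreach (ReplicaSketch candidate in Secondaries)
        {
            if (candidate.Sequence > best.Sequence) best = candidate;
        }
        Secondaries.Remove(best);
        Primary = best;
    }
}

// Hypothetical partition map: each partition owns an equal, contiguous range
// of the 32-bit hash space.
public sealed class PartitionMapSketch
{
    private readonly PartitionSketch[] partitions;

    public PartitionMapSketch(PartitionSketch[] partitions)
    {
        if (partitions == null || partitions.Length == 0)
        {
            throw new ArgumentException("At least one partition is required.", "partitions");
        }
        this.partitions = partitions;
    }

    // FNV-1a style hash over UTF-16 code units. string.GetHashCode is avoided
    // because it is not guaranteed to be stable across processes.
    public static uint Hash(string key)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (char c in key)
            {
                hash ^= c;
                hash *= 16777619;
            }
            return hash;
        }
    }

    // Directs a lookup to the partition that owns the key's hash range.
    public PartitionSketch Route(string key)
    {
        ulong range = ((ulong)uint.MaxValue + 1) / (ulong)partitions.Length;
        ulong index = Math.Min((ulong)Hash(key) / range, (ulong)(partitions.Length - 1));
        return partitions[(int)index];
    }
}
```

Routing is deterministic for a given key and map, so every machine that shares the same partition map sends a lookup to the same partition.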
Caches are embeddable to give better performance in some scenarios at the cost of load balancing and optimized locality of data. Embedding generally works better for replicated caches than partitioned caches. The cache uses either pessimistic locking for updates or optimistic updates using the sequence identifier to avoid locking by doing retries. To control eviction of items from the cache, policies include expiration of items, least recently used eviction of items, and memory pressure based culling.
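The optimistic update path can be sketched as a compare-and-retry loop on the sequence identifier. The names VersionedCacheSketch, TryGet, TryPut, and Update are hypothetical and this is not the Velocity client API; the sketch also uses a short internal lock only to make each individual read or write atomic, while the caller never holds a lock across its read, compute, and write steps.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-memory cache that illustrates optimistic updates with a
// per-item sequence identifier.
public sealed class VersionedCacheSketch
{
    private sealed class Entry
    {
        public string Value;
        public long Sequence;
    }

    private readonly object gate = new object();
    private readonly Dictionary<string, Entry> entries = new Dictionary<string, Entry>();

    // Reads the value and its sequence. A missing item reports sequence 0.
    public bool TryGet(string key, out string value, out long sequence)
    {
        lock (gate)
        {
            Entry entry;
            if (entries.TryGetValue(key, out entry))
            {
                value = entry.Value;
                sequence = entry.Sequence;
                return true;
            }
            value = null;
            sequence = 0;
            return false;
        }
    }

    // Writes only if the caller saw the latest sequence; otherwise another
    // writer got there first and the caller has to retry.
    public bool TryPut(string key, string value, long expectedSequence)
    {
        lock (gate)
        {
            Entry entry;
            long current = entries.TryGetValue(key, out entry) ? entry.Sequence : 0;
            if (current != expectedSequence) return false;
            entries[key] = new Entry { Value = value, Sequence = current + 1 };
            return true;
        }
    }

    // Optimistic update: read, compute, attempt the write, retry on conflict.
    public string Update(string key, Func<string, string> change, int maxAttempts)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            string current;
            long sequence;
            TryGet(key, out current, out sequence);
            string updated = change(current);
            if (TryPut(key, updated, sequence)) return updated;
        }
        throw new InvalidOperationException("The update kept conflicting with other writers.");
    }
}
```

The tradeoff against pessimistic locking is the usual one: no writer waits on a lock held by another client, but a writer that loses the race repeats its work, so this approach suits items that are rarely updated concurrently.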
Caching is an important optimization for enterprise and web scale service implementations. Velocity is a caching engine for low latency, high scale, and high availability of data between many readers and writers. The current caching engine is hostable to build replicated, partitioned, and embedded caches for applications. This talk focuses on the implementation details of updating and retrieving items from the cache. In the future, the caching engine may be exposed as a cloud caching service or support cloud applications that require caching.